Call a Node.js script from PHP and return all its data to PHP

can...@gmail.com

Feb 22, 2018, 4:42:11 PM
to headless-dev
Hello,

I managed to write a Node.js script that scrapes info.
I can print the info when I run the Node script.

Now I would like to call that Node script from PHP and return all its data to PHP.

How can I do that?

Thanks

PhistucK

Feb 22, 2018, 5:12:35 PM
to Charles Jeremy Colnet, headless-dev
It does not sound like a question related specifically to Headless Chrome (interfacing between Node.js and PHP), so this is probably not the right group for it. Maybe stackoverflow.com (though you will need to show some code that you tried and did not work, regardless of where you ask this).


PhistucK

--
You received this message because you are subscribed to the Google Groups "headless-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to headless-dev+unsubscribe@chromium.org.
To post to this group, send email to headle...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/headless-dev/6097b374-43ed-42b4-81fe-fda3a28b3771%40chromium.org.

can...@gmail.com

Feb 22, 2018, 5:22:19 PM
to headless-dev
Hello

Yes, it is related to headless Chrome. Here is my code:


'use strict';

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({headless: true});
  const page = await browser.newPage();
  await page.setRequestInterception(true);
  page.on('request', request => {
    if (request.resourceType() === 'image')
      request.abort();
    else
      request.continue();
  });
  await page.goto('http://example.com');

  const TITLE_PAGE = '.main-title h1';

  let innerTitle = await page.evaluate((sel) => {
    let html = document.querySelector(sel).innerHTML;
    console.log(html);
    return html.toString();
  }, TITLE_PAGE);
  console.log(innerTitle);

  await page.waitForSelector(".date-list");

  const REAL_TIME = '.date-list';

  let innerTimes = await page.evaluate((sel) => {
    let arTimes = document.querySelectorAll(sel);
    let sTimes = "";
    for (var i = 0; i < arTimes.length; i++) {
      sTimes += arTimes[i].innerHTML + "|";
    }
    return sTimes;
  }, REAL_TIME);
  console.log(innerTimes);
  console.log("load page");
  /* so far I am printing in the CLI */
  /* looking to export this data to PHP */

  await browser.close();
})();

can...@gmail.com

Feb 23, 2018, 7:10:05 AM
to headless-dev
Hello,

When I use

process.stdout.write("init");

outside of (async() => {

everything works: I can print, and PHP gets the data back.

But when I print the data inside (async() => { it returns 0.

I understand that it is async, but is there still a way to make it work?

Any suggestions?
Thanks
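
For reference, a minimal sketch of one common pattern for handing results from an async Node.js script to a calling process: do all the awaited work first, write a single JSON line to stdout at the end, and send errors to stderr with a non-zero exit code. It assumes the PHP side reads the script's standard output (for example with shell_exec) and decodes it with json_decode. The names scrape and toPayload are hypothetical, and scrape is only a stand-in for the Puppeteer steps. Nothing in this thread confirms that this resolves the "returns 0" symptom described above.

```javascript
'use strict';

// Hypothetical stand-in for the Puppeteer scraping steps. A real script would
// return the values obtained from page.evaluate() instead of fixed data.
async function scrape() {
  return { title: 'Example title', times: ['10:00', '11:30'] };
}

// Build the single JSON line that the calling process reads from stdout.
function toPayload(result) {
  return JSON.stringify(result) + '\n';
}

async function main() {
  try {
    const result = await scrape();
    // Write once, after all awaited work has finished.
    process.stdout.write(toPayload(result));
  } catch (err) {
    // Keep diagnostics on stderr so stdout stays valid JSON.
    console.error(err);
    process.exitCode = 1;
  }
}

main();
```

Returning structured JSON rather than a "|"-joined string leaves the parsing to the caller, and keeping debug output off stdout means the caller never has to separate log lines from data.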

Alex Clarke

Feb 23, 2018, 7:32:02 AM
to can...@gmail.com, headless-dev
I'm sorry, but this isn't really the right forum for PHP / Node.js interoperability questions. None of us here use PHP and few of us use Node.js, so we don't know how to help.

can...@gmail.com

Feb 23, 2018, 8:06:33 AM
to headless-dev
Hello Alex,
I understand completely. I thought that some people on this forum were using headless Chrome for web scraping, hence my question about how to collect the data back.

Thank you
