Data Capture

jean-...@buechi-schmitt.ch

March 13, 2018, 13:04:06
Hello,
I work with SwiftForth.
I want to capture data that a server sends me in real time.
Example: stock exchange software displays real-time quotes on the screen.
Is it possible to retrieve this data as it arrives (streaming)?
Thank you for your reply.
Jean-Pierre Schmitt

al...@rivadpm.com

March 15, 2018, 05:28:38
What is the source of the data? File? Push via RSS? Some other kind of messaging?

jean-...@buechi-schmitt.ch

March 15, 2018, 16:31:06
The data source is the net.
Let me put it differently.
I connect to a server (ProRealTime) as follows:
: ProRealTime S "https://www.prorealtime.com/en/workstation?launcher=1"> SHELL;

Once the connection is established, the server sends me information that appears on the screen in a web page.
This is the information I want to retrieve in my program for processing.

Thank you for your reply.
Cordially,
Jean-Pierre Schmitt

Howerd

March 17, 2018, 01:21:28
Hello Jean-Pierre,

I have some good news and some bad news for you - first the good news :
You have some extra and/or missing spaces in your Forth code. It should be:
: ProRealTime S" https://www.prorealtime.com/en/workstation?launcher=1" >SHELL ;

This is because Forth has a simple rule that words are bounded by spaces.

Your code has an extra space after the S and after the > , and it is missing a space after the opening quote, before the > , and before the ; :
: ProRealTime S "https:..."> SHELL;

To create a string the Forth word S" is used, not S " , likewise >SHELL , not > SHELL . Once you have got the hang of this everything is easy.

The bad news is that ProRealTime just passes a string to the underlying operating system, in my case Windows 7. The string looks like a URL, so Windows passes it to its default browser, in my case Firefox, and Firefox displays the HTML coming from the website.
You can right click on the web page and select "View Source", and you can then copy and paste the raw HTML into a file and search for the data you want.

But if you want to capture the raw HTML automatically, you enter the world of "web scraping" and "data mining", and the war currently being waged between people who publish data for human beings to read and people who want to gather this information using automated programs, filter it in useful ways, and sell it on to advertising revenue companies.

So what you need is a community of people dedicated to Free Software and Data who are working continuously on defeating the advertising revenue people and their attempts to stop you gathering their data.
As far as I know there is no such community of people using Forth to do this - it would be fun if there were...

In true Forth style, I would recommend using the simplest method, probably Python and something like BeautifulSoup.
e.g. https://realpython.com/blog/python/python-web-scraping-practical-introduction/ .
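
A minimal sketch of that idea in Python, using only the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages. The HTML snippet, the "symbol" and "quote" class names, and the QuoteExtractor class are all made up for illustration; a real page will have different markup, and a page that fills in its quotes with JavaScript will not have them in the raw HTML at all.

```python
from html.parser import HTMLParser

# Hypothetical markup: real pages will use different tags and class names.
SAMPLE_HTML = """
<table>
  <tr><td class="symbol">ABC</td><td class="quote">12.34</td></tr>
  <tr><td class="symbol">XYZ</td><td class="quote">56.78</td></tr>
</table>
"""


class QuoteExtractor(HTMLParser):
    """Collect (symbol, price) pairs from the symbol and quote table cells."""

    def __init__(self):
        super().__init__()
        self.current_class = None   # class of the <td> we are inside, if any
        self.pending_symbol = None  # last symbol seen, waiting for its price
        self.quotes = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.current_class = dict(attrs).get("class")

    def handle_endtag(self, tag):
        if tag == "td":
            self.current_class = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return  # whitespace between tags
        if self.current_class == "symbol":
            self.pending_symbol = text
        elif self.current_class == "quote" and self.pending_symbol is not None:
            self.quotes.append((self.pending_symbol, float(text)))
            self.pending_symbol = None


def extract_quotes(html):
    """Parse an HTML string and return the list of (symbol, price) pairs."""
    parser = QuoteExtractor()
    parser.feed(html)
    parser.close()
    return parser.quotes


if __name__ == "__main__":
    for symbol, price in extract_quotes(SAMPLE_HTML):
        print(symbol, price)
```

The same function could be fed the text saved from "View Source", provided the data is actually present in that text.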

If you have been given this as a student assignment to use Forth, then feel free to ask more questions here on clf.

The reason I responded to your post is that I have just been trying to do some web scraping. I looked at Forth, but there is more support in other languages.
BTW I failed because of CloudFlare - another of the barriers on the internet...

It's a fascinating subject, and Forth is a fun language - I hope I haven't put you off - which language you use is really rather unimportant compared to finding out how to work around the barriers :-)

Please keep us updated on your progress...

Cheers,
Howerd

Elizabeth D. Rather

April 21, 2018, 21:40:51
Howerd is clarifying the issue nicely. I will add that if you really
want to do this, you will need to enter into a business relationship
with whoever is providing the data and can give you access to the
stream. Then I would discuss with sup...@forth.com the best way to
approach writing your code. There's a lot of experience with data
acquisition at FORTH, Inc., and they can almost certainly be of help to
you. In fact, if you have this conversation first, you can probably
express yourself more clearly when talking with the data provider.

Cheers,
Elizabeth

--
Elizabeth D. Rather
FORTH, Inc.
6080 Center Drive, Suite 600
Los Angeles, CA 90045
USA

Mark Wills

April 23, 2018, 08:16:37
It's possible, yes. You need sockets. Your stock exchange provider probably
offers an API which will expose a JSON interface allowing you to get the
data for the tickers you want without having to decode HTML.
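
To illustrate the socket-plus-JSON approach, here is a rough Python sketch. It assumes a hypothetical feed that sends one JSON object per line with "ticker" and "price" fields; real provider APIs define their own message format, authentication, and transport, so the provider's documentation is the authority. A local socket.socketpair() stands in for the server so that the example runs without a network connection.

```python
import json
import socket


def read_quotes(sock):
    """Yield (ticker, price) tuples as newline-delimited JSON messages arrive."""
    buffer = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # the peer closed the connection
            break
        buffer += chunk
        # A single recv() may hold several messages or only part of one,
        # so split on newlines and keep any incomplete tail in the buffer.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                message = json.loads(line)
                yield message["ticker"], message["price"]


def demo():
    """Send two made-up quotes through a local socket pair and read them back."""
    # socketpair() returns two connected sockets; one plays the fake server.
    server, client = socket.socketpair()
    with server, client:
        for ticker, price in (("ABC", 12.34), ("XYZ", 56.78)):
            payload = json.dumps({"ticker": ticker, "price": price}) + "\n"
            server.sendall(payload.encode("utf-8"))
        server.shutdown(socket.SHUT_WR)  # signal end of stream
        return list(read_quotes(client))


if __name__ == "__main__":
    for ticker, price in demo():
        print(ticker, price)
```

Because read_quotes is a generator, each quote can be processed as soon as its line is complete rather than after the whole stream has ended.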

Generally the data is deliberately delayed by some factor. The more money
you pay, the more queries (bandwidth) you can submit, and at lower latency.

The HFT stuff relies on massive bandwidth and millisecond timing. That's
how they make their money.