Rank: 2

1#

Posted on 2015-3-31 21:11

Retrieving data from AJAX generated website

May I ask if anyone has tried to retrieve data from a website whose content is generated by AJAX? What kind of tools do you use?

Thanks in advance~

Jackass_TMxCK

Junior Member

Rank: 1

2#

Posted on 2015-3-31 22:11

Do you know, or have you even heard of, the "Same Origin Policy"?

Once you get around this restriction, it is no different from a normal website that gets its data from a RESTful API.

ronstudio

Intermediate Member

Rank: 2

3#

Posted on 2015-4-1 08:18

Reply to 2# Jackass_TMxCK

Thanks for the help! I'll do further research regarding this area!

Jackass_TMxCK

Junior Member

Rank: 1

4#

Posted on 2015-4-1 08:57

Reply to Jackass_TMxCK

Thanks for the help! I'll do further research regarding this area!
ronstudio, posted on 1/4/2015 08:18 AM

I just thought of something: server-side cURL is not restricted by the same-origin policy.

Check out PHP Simple Proxy.
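To make the idea concrete, here is a rough Python sketch of a server-side relay. It is not PHP Simple Proxy itself, only an illustration of the same approach; the names `fetch_server_side`, `RelayHandler`, `make_relay` and the `?url=` parameter are made up for this example. The server fetches the target page (no browser is involved, so no same-origin check applies) and passes the body back with a permissive CORS header.

```python
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def fetch_server_side(url, timeout=10):
    """Fetch a URL from server-side code and return the raw body bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


class RelayHandler(BaseHTTPRequestHandler):
    """Hypothetical relay: GET /?url=<target> answers with the target's body."""

    def do_GET(self):
        target = parse_qs(urlparse(self.path).query).get("url", [""])[0]
        # Only relay plain web URLs; urlopen would otherwise accept file:// too.
        if urlparse(target).scheme not in ("http", "https"):
            self.send_error(400, "url parameter must be http(s)")
            return
        try:
            body = fetch_server_side(target)
        except OSError:  # URLError and HTTPError are OSError subclasses
            self.send_error(502, "upstream fetch failed")
            return
        self.send_response(200)
        # Allows scripts served from other origins to read the relayed body.
        self.send_header("Access-Control-Allow-Origin", "*")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_relay(host="127.0.0.1", port=0):
    """Build the relay server; port 0 lets the OS pick a free port."""
    return HTTPServer((host, port), RelayHandler)
```

Calling `make_relay(port=8080).serve_forever()` would start it locally (8080 is an arbitrary port). A relay like this should be limited to the hosts you actually need before it is exposed to anyone else.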

ronstudio

Intermediate Member

Rank: 2

5#

Posted on 2015-4-1 17:24

Thanks. So far I notice that most of the solutions people use load some kind of browser module to simulate web browsing and then extract the data. In Perl, at least, there is a module that simulates Firefox to achieve this.

This is doable, but it takes more effort and needs further study.

chi251155

Intermediate Member

Rank: 2

6#

Posted on 2015-4-1 19:32

Reply to 1# ronstudio

curl or wget

ronstudio

Intermediate Member

Rank: 2

7#

Posted on 2015-4-4 23:48

The problem with plain wget is that the content of the page is generated by AJAX JavaScript. So when I use plain wget, it gets the HTML structure but none of the content I need to parse.
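A small Python sketch of what that looks like in practice. The page source in `STATIC_HTML` is invented for illustration (including the `/api/prices.json` path), and `visible_text` is just a helper name chosen here. It collects the text a static download would contain, skipping scripts, and shows that the data table is empty because the script has not run.

```python
from html.parser import HTMLParser

# Hypothetical page source, as a static downloader would save it. The table
# body is empty because a script is expected to fill it in after page load.
STATIC_HTML = """
<html><body>
  <h1>Prices</h1>
  <table id="prices"><tbody></tbody></table>
  <script>
    $.getJSON("/api/prices.json", render);
  </script>
</body></html>
"""


class VisibleText(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self._skip = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip:
            self.chunks.append(text)


def visible_text(html):
    """Return the non-script text chunks found in an HTML string."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return parser.chunks


print(visible_text(STATIC_HTML))  # -> ['Prices']
```

Only the heading comes back; the rows would exist only after a browser had executed the script.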

toylet

Banned

8#

Posted on 2015-4-5 00:02

Notice: the author has been banned or the content deleted, so this post is automatically hidden.

chi251155

Intermediate Member

Rank: 2

9#

Posted on 2015-4-5 00:06

Reply to 7# ronstudio

Use wget or curl to fetch the AJAX response itself. You know AJAX is a method of communication, right? You can easily find the URL of the data source, and fetching it is easier than crawling the web page, as the contents are usually formatted as JSON or XML.
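As a rough sketch of this approach in Python (the same thing curl or wget would do): request the data URL directly and parse the JSON. The function name `fetch_json` and the example URL in the usage note are assumptions for illustration. The `X-Requested-With` header is a convention some sites check for AJAX calls, not something every endpoint requires.

```python
import json
import urllib.request


def fetch_json(url, timeout=10):
    """Request a JSON endpoint directly and return the decoded data."""
    req = urllib.request.Request(
        url,
        headers={
            "Accept": "application/json",
            # Some endpoints only answer requests that look like AJAX calls.
            "X-Requested-With": "XMLHttpRequest",
        },
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return json.loads(resp.read().decode(charset))
```

Usage would look like `data = fetch_json("http://example.com/api/prices.json")`, where the real URL is whatever the page requests in the Network tab of the browser's developer tools.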

Jackass_TMxCK

Junior Member

Rank: 1

10#

Posted on 2015-4-5 00:16

Reply to ronstudio

use wget and curl to get the message of ajax. you know ajax is a method of co ...
chi251155, posted on 5/4/2015 12:06 AM

Same Origin Policy....
