A JSON scraping problem I could not solve even after a long time searching online


璨德林

Dec 10, 2017, 10:30:54 PM
to python.tw

Here is my code:


#coding:utf-8
from bs4 import BeautifulSoup
import requests
import json
import pymongo
import urllib
import urllib2
from pip.download import user_agent
from fake_useragent import UserAgent

ua = UserAgent()
headers = {'User-Agent': ua.random}
url = 'https://www.dailyfx.com.hk/calendar/index.html'

def dealData(url):
    client = pymongo.MongoClient('localhost', 27017)
    guoke = client['guoke']
    guokeData = guoke['guokeData']
    web_data = requests.get(url,headers=headers)
    datas = json.loads(web_data.text)
    print datas.keys()
    for data in datas['figure']:
        guokeData.insert_one(data)

def start():
    urls = ['https://www.dailyfx.com.hk/inc/cal_qry.php?symbol=rmb-usd-eur-jpy-gbp-chf-aud-cad-nzd&section=event-holiday-figure&sdate=2017-12-06&edate=2017-12-06']
    for url in urls:
        dealData(url)

start()



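My goal is to use this program to download the JSON data from the page. The site where I found the program had tested it and showed it fetching data from other websites, so the approach should be workable.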

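So I ran another experiment that simplifies the process: first just try to fetch the JSON, and worry about sending it to the database afterward. That still failed: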

import requests
import json

 
user_agent='Mozilla/4.0 (compatible;MISE 5.5;Windows NT)'
headers={'User-Agent':user_agent}
response = requests.get('https://www.dailyfx.com.hk/inc/cal_qry.php?symbol=rmb-usd-eur-jpy-gbp-chf-aud-cad-nzd&section=event-holiday-figure&sdate=2017-12-09&edate=2017-12-09',headers=headers)
data = json.loads(response)  
print data


How exactly should I change this to make it work?




Sonic Yang

Dec 11, 2017, 7:47:54 AM
to pyth...@googlegroups.com
hi
You need to use response.text

best regards 
alingo
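
A minimal sketch of the suggested fix, not the poster's exact code. In the simplified script, json.loads is given the Response object itself, which raises a TypeError; it needs the body text instead. The helper names parse_json_text and fetch_json, the timeout value, and the choice to return None for a non-JSON body (for example an HTML error page) are assumptions made for illustration.

```python
import json


def parse_json_text(text):
    """Parse a JSON document held in a string; return None if it is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:  # raised by json.loads when the text is not valid JSON
        return None


def fetch_json(url, headers=None):
    """Fetch url and parse the response body as JSON (hypothetical helper)."""
    # Imported here so parse_json_text can be tried without requests installed.
    import requests

    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    # json.loads needs the body text (a str), not the Response object itself.
    return parse_json_text(response.text)
```

Calling fetch_json with the cal_qry.php URL and the headers dict from the question would take the place of json.loads(response). requests also provides response.json(), which decodes the body in one step.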

On Monday, December 11, 2017, 璨德林 <davidli...@gmail.com> wrote:
