Python - How to iterate through a JSON that has multiple pages
I have created a program that iterates through a multi-page JSON object.
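    import json
    import subprocess

    def get_orgs(token, url):
        # Pass curl its arguments as a list so the call works with shell=False
        cmd = ['curl', '-i', '-k', '-X', 'GET',
               '-H', 'Content-Type: application/json',
               '-H', 'Authorization: Bearer ' + token,
               url]
        pipe = subprocess.Popen(cmd, shell=False, stdout=subprocess.PIPE,
                                stdin=subprocess.PIPE, universal_newlines=True)
        data = pipe.communicate()[0]
        row = None  # stays None if no line of the output parses as JSON
        # -i also prints the response headers, so only the JSON line parses
        for line in data.split('\n'):
            print(line)
            try:
                row = json.loads(line)
                print("next page url ", row['next'])
            except Exception:
                pass
        return row

    my_data = get_orgs(u'mybeearertoken', "https://data.ratings.com/v1.0/org/576/portfolios/36/companies/")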
The JSON object is below:
    [{"results": [{"liquidity":"strong","earningsperformance":"average"}],
      "next":"https://data.ratings.com/v1.0/org/576/portfolios/36/companies/?page=2"}]
I am using the 'next' key to iterate, but at times it points to an "invalid page" (a page that doesn't exist). Does the JSON object have a rule about how many records there are on each page? In that case, I could use it to estimate how many pages are possible.
Edit: adding more details. The JSON has 2 keys, ['results', 'next']. If there are multiple pages, the 'next' key has the next page's URL (as you can see in the output above). Otherwise, it contains 'None'. But the problem is that at times, instead of 'None', it points to a next page (which does not exist). So I want to see if I can count the rows in the JSON and divide by that number to know how many pages the loop needs to iterate through.
In my opinion, using urllib2 or urllib.request would be a better option than curl, in order to make the code easier to understand, but if curl is a constraint, I can work with that ;-)
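For illustration, here is a minimal sketch of the same request with urllib.request (Python 3). The helper names build_request and fetch_page are made up for this example, and it assumes the server accepts the same two headers the curl command sends. Unlike curl -k, it does not disable certificate verification. I have not run it against this API.

```python
import json
import urllib.request


def build_request(token, url):
    # Hypothetical helper: builds a GET request with the same headers as the curl call
    return urllib.request.Request(
        url,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
    )


def fetch_page(token, url):
    # Hypothetical helper: downloads one page and parses the body as JSON.
    # urlopen raises urllib.error.URLError (or its subclass HTTPError) on failure.
    with urllib.request.urlopen(build_request(token, url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Because only the response body is read here, there are no header lines to skip, so the line-by-line json.loads loop is no longer needed.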
Assuming the JSON response is on one line (otherwise your json.loads will throw an exception), the task is pretty simple, and this will allow you to fetch the number of items behind the results key:
    row = [{'next': 'https://data.ratings.com/v1.0/org/576/portfolios/36/companies/?page=2',
            'results': [{'earningsperformance': 'average', 'liquidity': 'strong'},
                        {'earningsperformance': 'average', 'liquidity': 'strong'}]}]
    result_count = len(row[0]["results"])
The alternative solution using httplib2 should look something like this (I didn't test this):
    import httplib2
    import json

    h = httplib2.Http('.cache')
    url = "https://data.ratings.com/v1.0/org/576/portfolios/36/companies/"
    token = "your_token"
    try:
        response, content = h.request(
            url,
            headers={'Content-Type': 'application/json',
                     'Authorization': 'Bearer ' + token}
        )
        # convert the response to a string
        content = content.decode('utf-8')  # or use the charset header
        try:
            data = json.loads(content)
            result_count = len(data[0]["results"])
            # yay, we got the result count!
        except Exception:
            # if the server responds with garbage
            pass
    except httplib2.HttpLib2Error:
        # handle exceptions, here's a list:
        # https://httplib2.readthedocs.io/en/latest/libhttplib2.html#httplib2.HttpLib2Error
        pass
For more on httplib2 and why it is amazing, I suggest reading Dive Into Python.
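As for the page count: the response described in the question only has the 'results' and 'next' keys, so there is no total to divide by. One way around that is to keep following 'next' and stop as soon as a page fails to load or comes back empty. Below is a rough sketch of that idea. The name iter_results and the fetch and max_pages parameters are made up for this example, and it assumes a JSON null in 'next' has already been decoded to Python's None.

```python
def iter_results(first_url, fetch, max_pages=1000):
    # fetch(url) is assumed to return the parsed JSON for one page and to
    # raise an exception when the page is invalid (hypothetical parameter)
    url = first_url
    seen = set()
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        try:
            page = fetch(url)
        except Exception:
            break  # 'next' pointed at a page that does not exist
        # The sample response wraps the object in a one-element list
        if isinstance(page, list):
            page = page[0] if page else {}
        results = page.get("results") or []
        if not results:
            break  # an empty page is treated as the end
        for item in results:
            yield item
        url = page.get("next")
```

Here fetch can be any function that downloads and parses a single page, so the loop itself does not care whether curl, urllib or httplib2 does the request. The seen set and max_pages are only safety nets against a 'next' link that loops back on itself.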