Pipl and Spokeo Script

I need a script that can take a Google Sheet (populated with first name, last name, date of birth, and state) and run the rows through the paid APIs of Pipl and Spokeo to locate an address and output into another Sheet or CSV, etc.

Do you have an easy way to share your Pipl API key if you already have one? I'm facing too many restrictions with the test one. And I can't find any API for Spokeo; it was apparently removed some time ago.
Bhullnatik almost 4 years ago
I don't think Spokeo's service can be 'nicely' accessed through a script. Spokeo's TOS says: "b) Other than connecting to Spokeo.com by http request using a web browser, You may not attempt to access Spokeo’s servers or Spokeo.com by any means. In particular, You are prohibited from scraping, crawling, data-mining, or using any robot, spider, or other automatic device to send queries to the Spokeo’s servers or Spokeo.com. You may not use Spokeo.com to compile data or images for use by any commercial entity."
SkeletonSlayerz almost 4 years ago
awarded to pieceofchalk


1 Solution


My solution: https://gist.github.com/pieceofchalk/2e087bbe46009cef54fc. If more than one address is found, each one is appended as an extra column to the CSV row.

python pipl api library:

git clone https://github.com/piplcom/piplapis-python.git
cd piplapis-python
sudo python setup.py install

or

wget https://github.com/piplcom/piplapis-python/archive/master.zip
unzip master.zip
cd piplapis-python-master
sudo python setup.py install

google spreadsheet lib :

pip install gspread

or

git clone https://github.com/burnash/gspread.git
cd gspread
python setup.py install

solution script:
pipl.py

#!/usr/bin/env python

from piplapis.search import SearchAPIRequest
from piplapis.data import Person
from piplapis.data.fields import Address, Name, DOB
import gspread
from datetime import datetime
import csv
import json
from oauth2client.client import SignedJwtAssertionCredentials


google_account = 'your_account@gmail.com'
api_key = 'sample_key'  # pipl api key
output_file = 'output.csv'


def write_to_csv(row):
    with open(output_file, 'a') as f:
        writer = csv.writer(f)
        writer.writerow(row)


def pipl(row):
    fields = [Name(first=row[0], last=row[1]),
              DOB.from_birth_date(datetime.strptime(row[2], '%Y-%m-%d').date()),
              Address(country='US', state=row[3])]
    request = SearchAPIRequest(api_key=api_key, person=Person(fields=fields))
    response = request.send()
    return response


def proccess(gc):
    wks = gc.open("pipl").sheet1
    wks_list = wks.get_all_values()
    for row in wks_list:
        search = pipl(row)
        if search.person:
            for address in search.person.addresses:
                row.append(address._display)
            write_to_csv(row)
        elif search.possible_persons:
            # no single match: collect the distinct addresses of all candidates
            address_list = []
            for person in search.possible_persons:
                address_list += [address._display for address in person.addresses]
            address_list = set(address_list)
            row += list(address_list)
            write_to_csv(row)


if __name__ == '__main__':
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('--json-key', type=str, default=False, help='OAuth2')
    parser.add_argument('--passwd', type=str, default=False, help='if you have 2-step verification, generate app password')
    args = parser.parse_args()
    if args.json_key:
        # OAuth2: authorize with the credentials from the downloaded JSON key file
        with open(args.json_key) as f:
            json_key = json.load(f)
        scope = ['https://spreadsheets.google.com/feeds']
        credentials = SignedJwtAssertionCredentials(json_key['client_email'], json_key['private_key'], scope)
        gc = gspread.authorize(credentials)
    elif args.passwd:
        # log in with the account password (or app password) given via --passwd
        gc = gspread.login(google_account, args.passwd)
    else:
        raise Exception('need json_key for OAuth2 or password if you have 2-step verification')
    proccess(gc)

my google spreadsheet: https://docs.google.com/spreadsheets/d/1Bdcsom1lNBr8F7m8o8JlVFGlTL1-HyHDVUyEJiDsiOM
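
The script above parses the third column with '%Y-%m-%d' and will raise on a header row or a differently formatted date. A minimal sketch of a row check that could run before the Pipl lookup is shown here. The helper name parse_row and the "Jane Doe" row are hypothetical, and the sketch is written for Python 3; it is not part of the gist.

```python
from datetime import datetime


def parse_row(row):
    """Return (first, last, dob, state), or None if the row is unusable."""
    if len(row) < 4:
        return None
    first, last, dob_text, state = (cell.strip() for cell in row[:4])
    try:
        # same date format the script passes to DOB.from_birth_date
        dob = datetime.strptime(dob_text, '%Y-%m-%d').date()
    except ValueError:
        return None  # header row or malformed date
    if not (first and last):
        return None
    return first, last, dob, state.upper()


rows = [
    ['first name', 'last name', 'date of birth', 'state'],  # header row
    ['Clark', 'Kent', '1974-03-09', 'KS'],
    ['Jane', 'Doe', '03/09/1974', 'ny'],  # hypothetical row, wrong date format
]
parsed = [parse_row(r) for r in rows]
```

Rows that come back as None can be skipped or logged instead of stopping the whole run.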

output csv:

Clark,Kent,1974-03-09,KS,"Kansas, United States"
Barak,Obama,1961-08-04,DC,Ukraine,"K St Nw, Washington, District Of Columbia","Nw K Street, Washington, District Of Columbia","Washington Dc, District Of Columbia","Washington, United States","Yakutsk, Russian Federation","Chicago, Illinois","Kalifornsky, Alaska","Los Angeles, California",United States,"Jakarta, Indonesia","Murmansk, Russian Federation","461 Dirksen Senate Office Building W, Washington, District Of Columbia","Washington, District Of Columbia","Honolulu, Hawaii","New York, New York","Illinois City, Illinois","Nw Pennsylvania Avenue, Washington, District Of Columbia"
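
The rows above have a different number of columns because one column is appended per distinct address. The script deduplicates with set(), which does not keep the addresses in the order they were found. A small sketch of an order-preserving alternative follows, assuming Python 3. The helper name merge_addresses and the "Smallville, Kansas" value are made up for illustration.

```python
import csv
import io


def merge_addresses(address_lists):
    """Flatten lists of address strings, dropping duplicates but keeping order."""
    seen = set()
    merged = []
    for addresses in address_lists:
        for address in addresses:
            if address not in seen:
                seen.add(address)
                merged.append(address)
    return merged


row = ['Clark', 'Kent', '1974-03-09', 'KS']
# one list of addresses per candidate person (sample data only)
candidates = [['Kansas, United States'],
              ['Kansas, United States', 'Smallville, Kansas']]
out_row = row + merge_addresses(candidates)

# write to an in-memory buffer here; the script appends to output.csv instead
buffer = io.StringIO()
csv.writer(buffer).writerow(out_row)
csv_line = buffer.getvalue()
```

The csv module quotes any address that contains a comma, which is why some fields in the output above are wrapped in double quotes and others are not.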

About spokeo.com: I agree with @SkeletonSlayerz. The TOS says, "You are prohibited from scraping, crawling, data-mining, or using any robot, spider, or other automatic device to send queries to the Spokeo's servers or Spokeo.com"

Isn't the e-mail/password authentication deprecated and disabled? https://developers.google.com/identity/protocols/AuthForInstalledApps?csw=1
Bhullnatik almost 4 years ago
It works for me because I use 2-step verification
pieceofchalk almost 4 years ago
You can use OAuth2 authorization, see http://gspread.readthedocs.org/en/latest/oauth2.html:
gc = gspread.authorize(OAuth2Credentials)
wks = gc.open("pipl").sheet1
pieceofchalk almost 4 years ago
Yeah, I tried it and it works for me too. That's why I was confused; I thought OAuth2 was mandatory. Anyway, your solution looks good!
Bhullnatik almost 4 years ago
@needmoredata does it work for you?
pieceofchalk almost 4 years ago
@pieceofchalk - I can see the viability of the code, but can't get OAuth2 to work nicely...
needmoredata almost 4 years ago
If you want to work with OAuth2, you need to create a new Client ID at https://console.developers.google.com, download it as a JSON file, and share your spreadsheet with the email address of this new client. A new version that works with both methods: https://gist.github.com/pieceofchalk/2e087bbe46009cef54fc
pieceofchalk almost 4 years ago
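
For the sharing step, the address to share the spreadsheet with is the client_email field of the downloaded JSON file, the same field the script passes to the credentials. A rough sketch for printing it is below. The function name service_account_email and the fake key contents are placeholders, not values from a real project.

```python
import json
import os
import tempfile


def service_account_email(json_key_path):
    """Return the client_email stored in a downloaded JSON key file."""
    with open(json_key_path) as f:
        key = json.load(f)
    return key['client_email']


# Demonstration with a fake key file; a real one is downloaded from the console.
fake_key = {'client_email': 'placeholder@example.com', 'private_key': '...'}
with tempfile.NamedTemporaryFile('w', suffix='.json', delete=False) as tmp:
    json.dump(fake_key, tmp)
email = service_account_email(tmp.name)
os.remove(tmp.name)
print(email)
```

If the spreadsheet is not shared with that address, opening it by name with the authorized client will most likely fail even though the authorization itself succeeded.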
run:
python pipl.py --passwd your_password
or:
python pipl.py --json-key yourproject-2334323343.json
pieceofchalk almost 4 years ago
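
Since exactly one of the two options is needed per run, the check could also be left to argparse. This is only a sketch of that idea using a required mutually exclusive group; build_parser is a hypothetical name and the gist does it with an explicit if/elif/else instead.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description='Look up addresses via Pipl')
    # exactly one of the two authentication options must be given
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument('--json-key', help='path to the OAuth2 JSON key file')
    group.add_argument('--passwd', help='account password or app password')
    return parser


# parse an explicit argument list here instead of sys.argv
args = build_parser().parse_args(['--json-key', 'yourproject-2334323343.json'])
```

With this setup argparse exits with a usage message when neither or both options are passed, so the final raise in the script would not be needed.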