Get air flights out of Tripit API

Hi. Looking for some help with some TripIt API code in Python. I'm not up to the task.

I've been using this Python app to get data from the TripIt API. Ahmedengu kindly upgraded it recently to do the auth automatically: https://github.com/ahmedengu/tripit-to-flightdiary

It uses this API. https://tripit.github.io/api/doc/v1/

I've tried playing with it, but it grabs too much data for me to make use of, as it's well beyond my Excel capabilities to turn it into a simple list. I just want to grab all objects of one type, e.g. air (as opposed to the complete trips it currently gathers). The objective is to use it for a tax diary to show days in each place.

But the problem is the API paginates, and I don't trust myself to merge the JSON into one file properly, page after page. So I'd like someone to adjust it so it can dump all the activities of the selected type into a single file by concatenating the different pages (note you can't request a page beyond the last one, so you'd need to discover how many pages there are and stop!)

These are the object types (my main interest is air, rail, lodging, cruise and car):
air | activity | car | cruise | parking | directions | lodging | map | note | rail | restaurant | transport | weather

I'm happy playing with Python, but if anyone fancies making it into a really user-friendly app, I think a few people might even pay for it as donationware etc.

Should the result be a day-by-day breakout of all the plan objects with their costs? And do you just need the start date of each object, or the whole span (e.g. a cruise that lasts a few days)?
CyteBode 10 days ago
Ideally yes - departure and arrival date/time are both important. Main concern is flights for me, but I guess other people might want rail etc. I only need this because the workaround in openflights.org doesn't give arrivals. Ideally dropping down the page in date order, all of the fields. If that can't be done...maybe just flights.
sebmack 10 days ago
Okay, so as an output format do you just want JSON, but with all the objects over all the pages in one list?
CyteBode 10 days ago
Greetings, if you're interested I can create a single-page web application that authorizes with the click of a button and shows the data in a table that can be exported to CSV, JSON, etc. However, the problem I'm facing is that I don't have any trips in my account to work with; I just have the examples from the documentation. So I'm going to start now by creating the app's authentication feature and populating the data into a table, then I'll show you the results for comments. A web app solution would take a couple of iterations and I already started late, so bear with me :) Also, my application can be used as a mobile app. The app can be used here (when it's done): https://ahmedengu.github.io/Tripit_Flights_Export/ The repo: https://github.com/ahmedengu/Tripit_Flights_Export
ahmedengu 10 days ago
Jeez. You two are amazing coders. So I was after CSV output and I'll answer CyteBode's comment below with a format. Why don't we let CyteBode finish that, as he's got that working nicely. But I think Ahmed's app idea is also awesome, so I'm happy to tip. How much do you want, Ahmed? Are you guys happy to add CyteBode's work onto Ahmed's fork and then have Ahmed make it into an app?
sebmack 10 days ago
Yeah sure, CyteBode can finish this one. In any case I may need more time to finish, as the TripIt API doesn't support cross-origin resource sharing (CORS), which makes it hard to use the API in the browser, so I'm going to create a proxy that makes the requests to the API and passes the responses to the browser. "Are you guys happy to add Cytebode's work onto Ahmed's fork and then have Ahmed make it into an app?" Well, it's a different approach, as I'm using JavaScript in the app; however, CyteBode's logic would help me.
ahmedengu 10 days ago
So far the token generation part is done: https://ahmedengu.github.io/Tripit_Flights_Export I'm waiting for CyteBode to deliver the logic you want, then I can use it to finish the exporting part.
ahmedengu 9 days ago
Cytebode - just letting you know there's a new bounty out there.
sebmack 8 days ago
I know, but I'm still working on it. I don't have as much free time this week so it'll take longer. I'm mostly done with the code, but I still need to deal with the user-friendliness, testing and writing the submission text.
CyteBode 8 days ago
Ahmedengu - CyteBode has put together some fantastic code in another app. It needs a couple more libraries. How much would you need to turn it into an app that could be used on a desktop?
sebmack 5 days ago
It would take about a day ... but I'm busy for a couple of days
ahmedengu 5 days ago
Ahmedengu, sounds great.
sebmack 1 day ago
awarded to CyteBode


1 Solution

Winning solution

Here is my solution:

activities.py

from collections import OrderedDict
import json
import sys

import requests
from requests_oauthlib import OAuth1


def join_duplicate_keys(ordered_pairs):
    d = OrderedDict()
    for k, v in ordered_pairs:
        if k in d:
            # Repeated key: collect all of its values in a list
            if isinstance(d[k], list):
                d[k].append(v)
            else:
                d[k] = [d[k], v]
        else:
            d[k] = v
    return d


VALID_TYPES = ("air", "activity", "car", "cruise", "parking", "directions",
               "lodging", "map", "note", "rail", "restaurant", "transport",
               "weather")
VALID_TRAVELER = ("true", "false", "all")
VALID_PAST = ("true", "false", "all")


def get_activities(type_, traveler = "true", past = "all", page_size = 100,
                   reversed_ = False, verbose = True):
    assert type_ in VALID_TYPES
    assert traveler in VALID_TRAVELER
    assert past in VALID_PAST

    if past == "all":
        past = ("false", "true")
    else:
        past = (past,)

    with open('creds.json') as f:
        creds = json.load(f)

    auth = OAuth1(creds['CLIENT_KEY'], creds['CLIENT_SECRET'],
        creds['OAUTH_TOKEN'], creds['OAUTH_TOKEN_SECRET'])

    objectTag = "%sObject" % type_.title()

    lst = []

    for period in past:
        if verbose:
            if period == "true":
                print("Fetching past activities")
            else:
                print("Fetching future activities")

        page_num = 1
        while True:
            if verbose:
                sys.stdout.write("Page %d of " % page_num)

            url = "".join(["https://api.tripit.com/v1/", 
                           "list/object/",
                           "traveler/%s/" % traveler,
                           "past/%s/" % period,
                           "format/json/",
                           "type/%s/" % type_,
                           "page_num/%d/" % page_num,
                           "page_size/%d" % page_size])

            response = requests.get(url, auth=auth)
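            response.raise_for_status()
            # Allow a charset suffix such as "application/json; charset=utf-8"
            assert response.headers["Content-Type"].startswith("application/json")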

            json_dict = json.loads(response.text,
                            object_pairs_hook = join_duplicate_keys)

            objects = json_dict.get(objectTag, [])
            if isinstance(objects, dict):
                # A single object comes back as a dict rather than a list
                lst.append(objects)
            else:
                lst.extend(objects)

            max_page = json_dict["max_page"]
            if verbose:
                sys.stdout.write("%s\n" % max_page)
                sys.stdout.flush()

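            # >= rather than == so the loop also ends if max_page is ever
            # lower than the current page (e.g. 0 when there are no results)
            if page_num >= int(max_page):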
                break
            page_num += 1

        if verbose:
            print("")

    if verbose:
        if len(lst) == 1:
            print("Fetched 1 activity")
        else:
            print("Fetched %d activities" % len(lst))

    if reversed_:
        lst.reverse()

    return {objectTag: lst}


def main():
    import argparse

    parser = argparse.ArgumentParser(description =
        "Fetch all the activities of a certain type from TripIt.")
    parser.add_argument("type", choices = VALID_TYPES,
        help = "The type of activity (required).")
    parser.add_argument("--past", "-p", default="all", choices = VALID_PAST,
        help = "Whether to fetch only past or future activities, or both.")
    parser.add_argument("--traveler", default="true", choices = VALID_TRAVELER,
        help = "Fetch only trips where the user is a traveler, not a traveler or both.")
    parser.add_argument("--page_size", "-n", default=100, type=int,
        help = "Specify the number of activities per page.")
    parser.add_argument("--reversed", "-r", action="store_true",
        help = "Reverse the order (so it's oldest to newest instead).")
    parser.add_argument("--output", "-o", default="",
        help = "Output to a file.")

    args = parser.parse_args()

    verbose = args.output != ""
    json_object = get_activities(
        args.type,
        args.traveler,
        args.past,
        args.page_size,
        args.reversed,
        verbose
    )

    if not verbose:
        print(json.dumps(json_object, indent=2))
    else:
        with open(args.output, "w+") as output:
            output.write(json.dumps(json_object, indent=2))


if __name__ == '__main__':
    main()
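
A side note on join_duplicate_keys: Python's json module normally keeps only the last value when an object repeats a key, so the script passes this hook to collect repeated keys into a list instead. A minimal standalone sketch of that behaviour (the sample JSON string is made up for illustration and is not real TripIt output):

```python
import json
from collections import OrderedDict


def join_duplicate_keys(ordered_pairs):
    """Collect the values of repeated keys into a list instead of overwriting."""
    d = OrderedDict()
    for k, v in ordered_pairs:
        if k not in d:
            d[k] = v
        elif isinstance(d[k], list):
            d[k].append(v)
        else:
            d[k] = [d[k], v]
    return d


# Made-up sample with a repeated "AirObject" key
sample = '{"AirObject": {"id": "1"}, "AirObject": {"id": "2"}, "max_page": "1"}'

plain = json.loads(sample)
joined = json.loads(sample, object_pairs_hook=join_duplicate_keys)

print(plain["AirObject"])   # only the last value survives
print(joined["AirObject"])  # both values, in order, as a list
```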

Command line usage

The script goes through all the pages of activities of a certain type and dumps them all into a single list.
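
The paging logic on its own can be sketched without any network access. In this sketch fetch_all, fake_fetch_page and the "items" key are hypothetical stand-ins; only the "max_page" field comes from the script above.

```python
def fetch_all(fetch_page):
    """Call fetch_page(1), fetch_page(2), ... and merge the items of every page."""
    items = []
    page_num = 1
    while True:
        page = fetch_page(page_num)
        items.extend(page["items"])
        # Stop at the last page; >= also covers an empty result
        if page_num >= int(page["max_page"]):
            break
        page_num += 1
    return items


# Hypothetical stand-in for the API: three pages of two items each
FAKE_PAGES = {1: ["a", "b"], 2: ["c", "d"], 3: ["e", "f"]}


def fake_fetch_page(page_num):
    # Every response reports the total page count, as a string
    return {"items": FAKE_PAGES[page_num], "max_page": "3"}


all_items = fetch_all(fake_fetch_page)
print(all_items)
```

Because the page count is read from each response, the loop never asks for a page beyond the last one.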

It can output to a file if a filename is specified with the -o or --output switch, with progress information being printed out to stdout. If the switch isn't used, it outputs directly to stdout without any progress information.

# Will output all the air activities to the output.json file
python activities.py air -o output.json

# Will output all the car activities to stdout, which can be piped to a file
python activities.py car > output.json

The order of the activities is as it comes from the TripIt API by default, which is from newest to oldest. To reverse it, the -r or --reversed switch can be specified with no argument.

By default, both past and future activities are fetched. To get only one or the other, the -p or --past switch can be used with a boolean argument (true for past activities, false for future activities).

# Will output only the past air activities
python activities.py air -p true

# Will output only the future air activities
python activities.py air -p false

The last two switches (traveler and page_size) just mirror the filtering functions of the TripIt API.

Python usage

The get_activities(...) function can also be used directly in Python from another script. It returns a dict with the same structure as the output JSON, i.e. a single key named after the type with a list of objects as its value: {"AirObject": [<Objects>]} for air, {"ActivityObject": [<Objects>]} for activity, and so on. The arguments are mostly the same as with the command line usage, except that it can't output to a file, and there's a verbose argument to toggle whether or not to print out progress information.

import activities

for activity in activities.get_activities("air")["AirObject"]:
    # Print the activity's display name
    print(activity["display_name"])
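
For the tax-diary use case from the question, the returned dict could be turned into dated rows along these lines. This is only a sketch: flight_rows and as_list are hypothetical helpers, and the key names Segment, StartDateTime, EndDateTime, date, start_city_name and end_city_name are assumptions about the TripIt data that should be checked against a real response.

```python
def as_list(value):
    """Wrap a lone dict in a list; a missing value becomes an empty list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def flight_rows(result):
    """Return (depart date, from, arrive date, to) tuples sorted by departure."""
    rows = []
    for activity in as_list(result.get("AirObject")):
        # Assumed layout: one dict per segment, or a list of them
        for seg in as_list(activity.get("Segment")):
            start = seg.get("StartDateTime", {})
            end = seg.get("EndDateTime", {})
            rows.append((
                start.get("date", ""), seg.get("start_city_name", ""),
                end.get("date", ""), seg.get("end_city_name", ""),
            ))
    return sorted(rows)


# Made-up data shaped like the assumed structure
sample = {"AirObject": [
    {"Segment": {"StartDateTime": {"date": "2017-03-02"}, "start_city_name": "London",
                 "EndDateTime": {"date": "2017-03-02"}, "end_city_name": "New York"}},
    {"Segment": [
        {"StartDateTime": {"date": "2017-01-10"}, "start_city_name": "Paris",
         "EndDateTime": {"date": "2017-01-10"}, "end_city_name": "Rome"},
        {"StartDateTime": {"date": "2017-01-15"}, "start_city_name": "Rome",
         "EndDateTime": {"date": "2017-01-15"}, "end_city_name": "Paris"},
    ]},
]}

rows = flight_rows(sample)
for row in rows:
    print(",".join(row))
```

With real data, the sample dict would be replaced by the result of activities.get_activities("air", verbose=False).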

Edit: Renamed the script to activities.py. Renamed plan to activity. Switched to using OrderedDict to preserve the order of the key-value pairs. Fixed a small bug with verbose output. Added order reversal feature. Added help descriptions to the arguments and switched the type argument to a positional argument. Added clearer usage instructions.

Hi. I think I do. I can tip you $25 more if you can do this, if that's OK? For all of the items I want to get all of the segment detail inc start/end or depart/arrival times. Is there a way to do this within the hierarchy eg... Trip Activity Segment Origin Destination (address detail if there) etc...? I guess the worry is the sub-record content is different between types of activity, but can you programmatically suck that up into a row?
sebmack 10 days ago
Hi. 1. I do need the processing. Is $25 more OK? For all of the items I want to get all of the segment detail inc start/end or depart/arrival times. Is there a way to do this within the hierarchy eg... Trip Activity Segment Origin Destination (address detail if there) etc...? I guess the worry is the sub-record content is different between types of activity, but can you programmatically suck that up into a row regardless of type?
  2. Can you confirm how the arguments work? And maybe give me a paragraph to add to the readme? Or create a new fork from Ahmed's? Then we can get the benefits of his automation of the tokens and the benefits of the new 'Activity functionality' - let's call it activities.py?
sebmack 10 days ago
I'm not quite sure I really understand what you want. Could you give me an example spreadsheet with 2-3 activities manually formatted the way you want it? I'm mostly confused by how you want the hierarchy to get flattened into rows, so I would like to see some activities with one or more segments. I'll update my post with more thorough usage instructions.
CyteBode 10 days ago
Hi, example would be: Trip Name|TripID|Type|ActivityID|ActivityCost (show in the first d not for each segment)|TravellerYN |URL|BookingSite|SupplierConfirmation| BookingDate|BookingSitePhone|Traveller|ticketNumber| AirlineCode|Aircraft|ServiceClass|FlightNumber|StartCountry| StartCityName|StartAirport|StartTerminal|StartLat|StartLong|StartDate |StartTime|EndCountry|EndCityName|EndAirport|EndTerminal| EndLat|EndLon|EndDate|EndTime|Stops|Distance| Obviously for rail and road there will be some differences. There won't be terminals. It's Station name not Airport Name. Maybe just include the rail fields as trailing fields...and if fields are blank for the whole set don't populate them. I guess the answer is just to get everything from the data.
sebmack 9 days ago
I still don't understand how exactly you want this to be flattened. For each air activity, there can be multiple segments and multiple travelers, so do you want one row per combination? For example, if I have 2 travelers (A, B) and 2 segments (1, 2), that would create 4 rows consisting of A1, A2, B1 and B2? Should the repeating data be blank for subsequent rows for clarity?
CyteBode 9 days ago
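
The "one row per combination" layout asked about above can be sketched as a cross product. The traveler and segment labels are the ones from the comment; blank_repeats is a hypothetical helper showing the "repeating data left blank" variant.

```python
from itertools import product

travelers = ["A", "B"]
segments = ["1", "2"]

# One row per (traveler, segment) combination
pairs = list(product(travelers, segments))
labels = [t + s for t, s in pairs]
print(labels)


def blank_repeats(rows):
    """Blank out the first column whenever it repeats the row above."""
    out, previous = [], None
    for first, *rest in rows:
        out.append(("" if first == previous else first, *rest))
        previous = first
    return out


print(blank_repeats(pairs))
```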
Furthermore, since you want a specific (and rather extensive) subset of the data, in a specific order, with specifically different header names than their key names in the data returned by the API, that would require a ton of manual work to handcraft an exporting function specific for each and every type of activity. I can't do that for just a $25 tip. I was thinking more of extracting a much more succinct amount of data which didn't differ much from one type of activity to another, such as just the cost and start/end dates.
CyteBode 9 days ago
Given the complexity of this new problem, I think you should create a new bounty, because it's going way beyond the scope of what was specified in this bounty.
CyteBode 9 days ago
Okay, I currently have a workable solution for the new problem. As I said earlier, I'm inviting you to make a new bounty because this is a completely different solution to a completely different problem. I actually followed a data-driven approach so I don't have to tediously hardcode the extraction logic for every type of activity. Instead, I'm defining a recursive extraction schema with reusable parts for every type. Then I have this single recursive algorithm to extract the data for any type.
CyteBode 9 days ago
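
To make the idea concrete, here is a rough sketch of what a recursive, data-driven extraction schema could look like. It is not CyteBode's actual code, which isn't shown in this thread; the schema layout, the extract function and the TripIt key names (StartDateTime, EndDateTime, date, time, start_city_name) are all assumptions for illustration.

```python
# Reusable part: how to flatten any date/time sub-record
DATETIME_SCHEMA = {"date": "Date", "time": "Time"}

# A leaf maps a data key to a column header; a tuple is
# (header prefix, nested schema) for a sub-record.
SEGMENT_SCHEMA = {
    "StartDateTime": ("Start", DATETIME_SCHEMA),
    "EndDateTime": ("End", DATETIME_SCHEMA),
    "start_city_name": "StartCityName",
}


def extract(record, schema, prefix=""):
    """Flatten record into {column header: value}, following schema recursively."""
    row = {}
    for key, spec in schema.items():
        if isinstance(spec, tuple):
            sub_prefix, sub_schema = spec
            # Missing sub-records still yield their columns, as empty cells
            row.update(extract(record.get(key) or {}, sub_schema, prefix + sub_prefix))
        else:
            row[prefix + spec] = record.get(key, "")
    return row


# Made-up segment with no EndDateTime
segment = {
    "StartDateTime": {"date": "2017-01-02", "time": "08:00:00"},
    "start_city_name": "London",
}

row = extract(segment, SEGMENT_SCHEMA)
print(row)
```

Since every schema entry always produces a column, rows keep the same length even when the data is missing, which is the problem with taking the API data as-is.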
Hi. I think travellers can be concatenated into a single cell as Name-Ticket Number, Name-Ticket Number etc. The other booking fields can be repeated, and the segments really form the core. I think let's remove trip cost from this output as it could cause confusion; but could we put a switch on the program to allow a similar flat output of JUST the trip info that enables people to join it? The rest of the field order I am not wedded to - if it saves cash, let's just import from the different kinds of activity in the order they come in the API? Still needs a new bounty?
sebmack 9 days ago
My solution currently follows the schema that you gave me earlier exactly and the repeat data is output as empty cells. Example output: https://pastebin.com/n8PHvwNt (Data is bogus and incomplete, but the solution will take everything if it's present). Since there is no repetition, the cost only appears once. Just taking the data as-is from the API doesn't work well because it doesn't include any empty data (rows would have variable lengths) and there's also a ton of superfluous data.
CyteBode 9 days ago
As for making a new bounty, that would be good because this discussion area is getting crowded. I would be okay with a $25 price; I can take into account that you gave me a $50 tip earlier for a relatively small amount of effort and that (I suppose) you'll be awarding me this bounty as well. Plus my recursive schema solution made things more straightforward than what I had initially envisioned.
CyteBode 9 days ago
Hi. OK. Is there any etiquette on a new bounty aimed at one person? Or shall I tip another $25? Also - per my earlier message that crossed paths with yours... is there a) a way to remove trip cost, because it's the one that's confusing if repeated? The rest should be repeated if possible because I foresee people manipulating it in Excel using filters. b) A way to use the same recursive approach to extract the TRIPS with trip cost from the API using trip instead of object? That way if people need the cost, they can extract it separately and VLOOKUP etc. How is that? PS let me know re new bounty or tipping. Happy to do either.
sebmack 9 days ago
Just make the bounty saying it's aimed at me and ignore any solution from anyone trying to hijack it. A) Yeah, that can be done (I added a skipping mechanism). I can also easily remove the deduplication mechanism. B) I'm already fetching trips instead of objects.
CyteBode 9 days ago
New bounty created. But pls ensure we can have switches for what to call etc and something Ahmedengu can call programmatically with the app he will wrap around this.
sebmack 9 days ago
Also pls bear in mind I'm a total amateur. A slightly easier readme for the uninformed would be great as it took me several go's to get your latest version to output. Tks.
sebmack 9 days ago
Thanks for awarding me this bounty. I'll do my best to make the new solution user-friendly. With that said, you should just tell me what may confuse you. I'm still left in the dark as to what I need to make clearer.
CyteBode 9 days ago
Ahmedengu - do you want me just to tip for an app when ready, or set up a new bounty? How much?
sebmack 9 days ago
"Do you want me just to tip for an app when ready, or set up a new bounty?" Whatever works for you. "How much?" I don't really like to talk about money, that's why I didn't answer before. Whatever amount you're comfortable with would be okay with me: $25 or $50 or FREE, all okay :)
ahmedengu 9 days ago
Did you try https://openflights.org/ ? I think it's a good solution: it supports importing from TripIt and exporting to CSV, and the CSV contains these columns: Date,From,To,Flight_Number,Airline,Distance,Duration,Seat,Seat_Type,Class,Reason,Plane,Registration,Trip,Note,From_OID,To_OID,Airline_OID,Plane_OID. This list of apps may also be helpful: https://www.tripit.com/uhp/tools Let me know if my app is still useful for you.
ahmedengu 9 days ago
Hi. The problem with openflights (which I've used in the interim) is it doesn't have arrival times. Super annoying as inbound that's critical...all this effort basically because it doesn't have what Skyhops had!
sebmack 9 days ago
I am good to tip for an app, but CyteBode is still working on it... let's see how it all comes together. Thanks btw... great coding!!
sebmack 9 days ago
Sounds good, You are most welcome
ahmedengu 8 days ago