Jump to: Things to Keep in Mind | Download ETL Script & Config File | Getting Started | Example JSON File | Start exports | Check export status | Resume exports | Display all S3 export links | Finishing the ETL Process
For many TUNE customers, it makes sense to pull log-level data into their own systems on an ongoing basis, so that a historical copy of the data is available for internal analysis. One way to accomplish this is to use the new v3 TUNE Logs API to extract data incrementally on a regular schedule, transforming and loading it into your own system as necessary.
Database developers may recognize this pattern of extraction, transformation, and loading as the classic ETL process.
Ultimately, the way you implement ETL of TUNE Logs data depends on the specific needs and requirements of your business. That said, to get you started, we’ve provided an example Python script in this article that demonstrates one way of doing ETL with TUNE's new v3 Log service, along with a few configuration parameters you can tweak to suit your needs (timezone, extraction interval, etc.).
Things to Keep in Mind
- We limit the exporting of log data to the last 120 days' worth of data. For anything older than 120 days, please submit a request via our Request a Report Pull form.
- As of May 2018, the maximum retention period for all log-level data is 25 months. Therefore, we highly recommend pulling your log-level data on a regular basis to ensure continuous access to your historical data.
- This code is provided without warranty or support—it is not a TUNE product, but rather a starting point that any developer can use for writing their own ETL-style script for transferring Logs data from TUNE to your corporate marketing data warehouse.
Download ETL Script & Config File
To start modifying this code to suit your own purposes, the first step is to download the following Python script and configuration file. Then you can set it to run on a scheduled interval to perform batch exports for user-defined time increments and collect your S3 download links.
Finally, you'll need to use your favorite HTTP library to download the data behind the S3 links to transform and load it into your system as appropriate.
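For example, a minimal download step using the requests library might look like the sketch below. The URL and output filename are placeholders for illustration, not values the TUNE API returns:

```python
import requests

# Placeholder: one of the S3 download links collected by the export script
s3_url = "https://example-bucket.s3.amazonaws.com/path/to/export.csv?X-Amz-Signature=..."

# Stream the export to disk so large files don't have to fit in memory
response = requests.get(s3_url, stream=True)
response.raise_for_status()

with open("installs_export.csv", "wb") as f:
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)
```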
Getting Started
Update the configuration file
Tailor the script for your organization by updating export_report.json with the following information. Only include the report types that you need (clicks, postbacks, installs, etc.):
{ "advertiser": { "id": ADVERTISER_ID, "api_key": "API_KEY", "timezone": "TIMEZONE_STRING" }, "reports": { "installs": { "url": "EXPORT_URI", "opt_interval_by_hour": HOUR_INTERVAL }, "clicks": { "url": "EXPORT_URI", "opt_interval_by_hour": HOUR_INTERVAL } ... } }
| Placeholder | Description |
| --- | --- |
| ADVERTISER_ID | Integer; your organization's advertiser ID. |
| API_KEY | String; your organization's API key. |
| TIMEZONE_STRING | String; a tz database (IANA) timezone name, such as "America/Los_Angeles". |
| EXPORT_URI | String; the export call you wish to make. Omit the start_date, end_date, limit, and api_key parameters, since these fields are filled in by the script. The limit is preset to the maximum number of records, 2 million. |
| HOUR_INTERVAL | Integer or float representing the time increments you would like to pull data for. If 1 is provided, exports are pulled hourly between the start_date and end_date provided. Float values are also accepted, such as 0.5 for half-hour increments (see the sketch following this table). |
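To make the interval math concrete, here is a minimal sketch (our illustration, not the script's actual internals) of how a date range splits into export windows under a given opt_interval_by_hour:

```python
from datetime import datetime, timedelta

def split_into_windows(start, end, interval_hours):
    """Yield (window_start, window_end) pairs covering [start, end)."""
    step = timedelta(hours=interval_hours)
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end

# With opt_interval_by_hour = 6, one day becomes four 6-hour export jobs
for window in split_into_windows(datetime(2018, 3, 1), datetime(2018, 3, 2), 6):
    print(window)
```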
Example JSON File
{ "advertiser": { "id": 0000, "api_key": "XXXXXX", "timezone": "America/Los_Angeles" }, "reports": { "installs": { "url": "https://api.mobileapptracking.com/v3/logs/advertisers/0000/exports/installs?api_key=XXXXXX&fields=created", "opt_interval_by_hour": 6 }, "clicks": { "url": "https://api.mobileapptracking.com/v3/logs/advertisers/0000/exports/clicks?api_key=XXXXXX&fields=created", "opt_interval_by_hour": 24 }, "impressions": { "url": "https://api.mobileapptracking.com/v3/logs/advertisers/0000/exports/impressions?api_key=XXXXXX&fields=created" }, "event_items": { "url": "https://api.mobileapptracking.com/v3/logs/advertisers/0000/exports/event_items?api_key=XXXXXX&fields=created" } } }
Start exports
Once you have customized export_report.json, run the export script from the command line, passing a start and end timestamp:
python export_report_jobs.py new export_report.json "YYYY-MM-DDTHH-MM-SS" "YYYY-MM-DDTHH-MM-SS"
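For example, to pull exports between midnight on March 1 and midnight on March 2, 2018 (timestamps follow the YYYY-MM-DDTHH-MM-SS format shown above):

python export_report_jobs.py new export_report.json "2018-03-01T00-00-00" "2018-03-02T00-00-00"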
Once the processes are complete, your export links can be found in the export_report.json.jobs file.
Check export status
As the new export process runs, you can check the status of the exports with:
python export_report_jobs.py status export_report.json "YYYY-MM-DDTHH-MM-SS" "YYYY-MM-DDTHH-MM-SS"
Resume exports
If the new export process is interrupted, use this command to resume it:
python export_report_jobs.py resume export_report.json "YYYY-MM-DDTHH-MM-SS" "YYYY-MM-DDTHH-MM-SS"
Display all S3 export links
To retrieve all S3 download URLs for exported data, run:
python export_report_jobs.py display export_report.json "YYYY-MM-DDTHH-MM-SS" "YYYY-MM-DDTHH-MM-SS"
S3 export download links expire two weeks after they are generated.
Finishing the ETL Process
Remember that TUNE's S3 links have a standard expiration of two weeks, so the script that collects the S3 links should be tightly coupled with whatever process is responsible for downloading the data and storing it in your system.
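As a closing illustration, here is a sketch of a combined download-and-load step. The link file, table name, and CSV assumption are ours for the example, not guarantees about the TUNE exports:

```python
import csv
import io
import sqlite3

import requests

# Assumption: s3_links.txt holds one S3 download URL per line,
# collected from the display command described above
with open("s3_links.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

conn = sqlite3.connect("tune_logs.db")

for url in urls:
    response = requests.get(url)
    response.raise_for_status()

    # Assumption: the export is CSV with a header row
    reader = csv.reader(io.StringIO(response.text))
    header = next(reader)

    # Raw landing table whose columns mirror the export's header;
    # a real pipeline would use one table per report type
    columns = ", ".join('"{}" TEXT'.format(col) for col in header)
    conn.execute("CREATE TABLE IF NOT EXISTS raw_logs ({})".format(columns))

    placeholders = ", ".join("?" for _ in header)
    conn.executemany(
        "INSERT INTO raw_logs VALUES ({})".format(placeholders),
        (row for row in reader if len(row) == len(header)),
    )
    conn.commit()

conn.close()
```

In practice, you would schedule the extract, download, and load steps together so that no link expires before it is consumed.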