Mozilla Telemetry Data S3 Input Sandbox Bootstrapper
This sandbox:
- retrieves a list of files from S3 matching the dimension specification in dimension_file
- divides the list among the specified number of partitions
- generates the cfgs to dynamically start a new reader for each partition; the specified input_plugin should already be installed in run/input
- exits when the bootstrapping is complete
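
The "divides the list" step can be pictured with a small sketch. The text does not say how the bootstrapper actually assigns files to partitions, so the round-robin scheme, the helper name partition_files, and the sample keys below are all assumptions made for illustration.

```lua
-- Hypothetical helper (not taken from moz_telemetry_s3_bootstrap.lua):
-- split a list of S3 keys into `partitions` buckets, round-robin.
local function partition_files(files, partitions)
    assert(partitions > 0, "partitions must be positive")
    local buckets = {}
    for p = 1, partitions do
        buckets[p] = {}
    end
    for i, key in ipairs(files) do
        -- map 1-based list index i onto a 1-based bucket index
        local p = (i - 1) % partitions + 1
        local bucket = buckets[p]
        bucket[#bucket + 1] = key
    end
    return buckets
end

-- Made-up keys; real keys would come from the S3 listing.
local files = {"a", "b", "c", "d", "e"}
local buckets = partition_files(files, 2)
-- buckets[1] holds a, c, e; buckets[2] holds b, d
```

Round-robin keeps the partition sizes within one file of each other, but it ignores file size; a size-aware split may balance reader workload better.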
1. Sample Configuration
filename = "moz_telemetry_s3_bootstrap.lua"
instruction_limit = 0
ticker_interval = 0
input_plugin = "telemetry_s3_snappy.lua"
input_plugin_cfgs = {} -- table of sandbox specific config options
tmp_dir = "/mnt/work/tmp"
s3_bucket = "net-mozaws-prod-us-west-2-pipeline-data"
s3_prefix = "telemetry-2"
dimension_file = "dimensions.json"
partitions = 8
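
As a rough illustration of how input_plugin, input_plugin_cfgs, tmp_dir, and partitions might combine into one generated cfg per partition, consider the sketch below. The function render_cfg, the option name file_list, and the partition_N.list paths are hypothetical; the options the reader plugin really expects are not stated here.

```lua
-- Hypothetical sketch: render the text of one reader cfg per partition.
-- `file_list` is an invented option name, used only for illustration.
local function render_cfg(input_plugin, list_path, extra)
    local lines = {
        string.format("filename = %q", input_plugin),
        string.format("file_list = %q", list_path),
    }
    -- Copy sandbox specific options in sorted key order so the
    -- generated text is deterministic.
    local keys = {}
    for k in pairs(extra or {}) do
        keys[#keys + 1] = k
    end
    table.sort(keys)
    for _, k in ipairs(keys) do
        local v = extra[k]
        local fmt = type(v) == "string" and "%s = %q" or "%s = %s"
        lines[#lines + 1] = string.format(fmt, k, tostring(v))
    end
    return table.concat(lines, "\n")
end

local tmp_dir = "/mnt/work/tmp"
local partitions = 2  -- the sample configuration uses 8
local cfgs = {}
for p = 1, partitions do
    -- made-up naming scheme for the per-partition file list
    local list_path = string.format("%s/partition_%d.list", tmp_dir, p)
    cfgs[p] = render_cfg("telemetry_s3_snappy.lua", list_path,
                         {ticker_interval = 0})
end
```

Each entry in cfgs is the body of one configuration file; writing those files to disk and starting the readers is left out of the sketch.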
source code: moz_telemetry_s3_bootstrap.lua