Exporting Data#

This vignette provides an in-depth guide to exporting event data from Teamworks AMS using teamworksams, focusing on the get_event_data() and sync_event_data() functions. It covers practical workflows for retrieving event data within a date range, synchronizing updated events, and applying filters and options to customize outputs. This guide equips you to efficiently export AMS event data for analysis or reporting. See Functions and Classes for detailed function documentation and Importing Data for additional data workflows.

Overview#

teamworksams simplifies exporting AMS event data, offering two key functions:

  • get_event_data(): Retrieves event data from an AMS Event Form (e.g., ‘Training Log’) for a specified date range, with optional filtering by user attributes or data fields and support for downloading attachments.

  • sync_event_data(): Fetches events inserted or updated since a synchronization time, ideal for incremental data updates, with options to include user metadata or UUIDs.

These functions return pandas.DataFrame objects, enabling seamless integration with Python data analysis tools like pandas. Filters (EventFilter(), SyncEventFilter()) and options (EventOption(), SyncEventOption()) provide fine-grained control over data retrieval. All examples use the placeholder URL https://example.smartabase.com/site, username username, and password password, managed via a .env file.

Prerequisites#

Ensure teamworksams is installed and credentials are configured, as described in Getting Started with teamworksams. Use a .env file for secure credential management:

.env#
AMS_URL=https://example.smartabase.com/site
AMS_USERNAME=username
AMS_PASSWORD=password

Load credentials:

from dotenv import load_dotenv
load_dotenv()

Required dependencies (installed with teamworksams): pandas, requests, python-dotenv, tqdm. See Managing Credentials, Authentication, and Caching for alternative credential methods (os, direct arguments, keyring).

Exporting Event Data#

Use get_event_data() to retrieve event data from an AMS Event Form within a date range, optionally filtering by user attributes or data fields and customizing outputs with EventOption().

Basic Usage

Fetch all events from the ‘Training Log’ form for January 2025:

from teamworksams import get_event_data
df = get_event_data(
    form = "Training Log",
    start_date = "01/01/2025",
    end_date = "31/01/2025",
    url = "https://example.smartabase.com/site",
    option = EventOption(interactive_mode = True)
)
print(df.head())

Output:

ℹ Requesting event data for 'Training Log' between 01/01/2025 and 31/01/2025
✔ Retrieved 10 event records for form 'Training Log'.
   about  user_id  event_id  form         start_date  duration  intensity
     0  John Doe    12345    67890  Training Log  01/01/2025        60       High
     1  Jane Smith  12346    67891  Training Log  02/01/2025        45     Medium
...

Filtering Events

Use EventFilter() to narrow results, e.g., events for users in ‘TeamA’ with ‘intensity’ equal to ‘High’:

from teamworksams import EventFilter, EventOption
df = get_event_data(
    form = "Training Log",
    start_date = "01/01/2025",
    end_date = "31/01/2025",
    url = "https://example.smartabase.com/site",
    filter = EventFilter(
        user_key = "group",
        user_value = "TeamA",
        data_key = "intensity",
        data_value = "High",
        data_condition = "equals"
    ),
    option = EventOption(interactive_mode = True, clean_names = True)
)
print(df[['user_id', 'intensity']].head())

Output:

ℹ Requesting event data for 'Training Log' between 01/01/2025 and 31/01/2025
✔ Retrieved 5 event records for form 'Training Log'.
   user_id intensity
0    12345      High
1    12347      High
...

Downloading Attachments

Download event attachments to a specified directory - either a specified absolute or relative path. If None, attachments will be saved in the current working directory.

df = get_event_data(
    form = "Training Log",
    start_date = "01/01/2025",
    end_date = "31/01/2025",
    url = "https://example.smartabase.com/site",
    option = EventOption(
        interactive_mode = True,
        download_attachment = True,
        attachment_directory = "./attachments"
    )
)
print(df.columns)

Output:

ℹ Requesting event data for 'Training Log' between 01/01/2025 and 31/01/2025
✔ Retrieved 10 event records for form 'Training Log'.
Index(['about', 'user_id', 'event_id', 'form', 'start_date', 'attachment_path'], ...)

See get_event_data(), EventFilter(), and EventOption() for details.

Synchronizing Event Data#

Use sync_event_data() to fetch events inserted or updated since a synchronization time, ideal for incremental updates. Returns a pandas.DataFrame and a new synchronization time.

Basic Usage

Synchronize events from ‘Training Log’ since March 1, 2023:

from teamworksams import sync_event_data, SyncEventOption
df, new_sync_time = sync_event_data(
    form = "Training Log",
    last_synchronisation_time = 1677654321000,  # 2023-03-01 12:25:21
    url = "https://example.smartabase.com/site",
    option = SyncEventOption(interactive_mode = True)
)
print(df[['user_id', 'start_date']].head())
print(f"New sync time: {new_sync_time}")

Output:

ℹ Requesting event data for 'Training Log' since 2023-03-01 12:25:21
✔ Retrieved 5 event records for 'Training Log' since 2023-03-01 12:25:21
   user_id start_date
0    12345 2025-03-01
...
New sync time: 1677654400000

Now, the next time you synchronise with that form, you can use the newly acquired ‘new_sync_time’ value to find any records that have been inserted/updated since you last called sync_event_data():

df, new_sync_time = sync_event_data(
    form = "Training Log",
    last_synchronisation_time = new_sync_time,
    url = "https://example.smartabase.com/site",
    option = SyncEventOption(interactive_mode = True)
)

By querying which records have been inserted/updated in a form beyond a certain time, you can set up workflows that automatically trigger Python scripts whenever new data is detected.

Filtering Synchronized Events

Synchronize events for users in ‘TeamA’:

from teamworksams import SyncEventFilter
df, new_sync_time = sync_event_data(
    form = "Training Log",
    last_synchronisation_time = 1677654321000,
    url = "https://example.smartabase.com/site",
    filter = SyncEventFilter(user_key = "group", user_value = "TeamA"),
    option = SyncEventOption(interactive_mode = True, include_user_data = True)
)
print(df[['about', 'start_date']].head())

Output:

ℹ Requesting event data for 'Training Log' since 2023-03-01 12:25:21
✔ Retrieved 3 event records for 'Training Log' since 2023-03-01 12:25:21
   about      start_date
0  John Doe  2025-03-01
...

Handling Deleted Events

Access deleted event IDs from the DataFrame’s attributes:

deleted_ids = df.attrs['deleted_event_ids']
print(f"Deleted event IDs: {deleted_ids}")

Output:

Deleted event IDs: [67888, 67889]

See sync_event_data(), SyncEventFilter(), and SyncEventOption() for details.

Options and Usage Notes#

This section provides detailed guidance on using option classes (EventOption(), SyncEventOption()) and filter classes (EventFilter(), SyncEventFilter()) to customize export operations, along with key usage notes for date/time handling, caching, column types, interactive mode, and attachment storage.

Option Classes

Each export function requires a specific option class to configure its behavior. These classes must be instantiated with parameters like interactive_mode, cache, and others. For example, to disable column type casting in get_event_data():

from teamworksams import get_event_data, EventOption
df = get_event_data(
    form = "Training Log",
    start_date = "01/01/2025",
    end_date = "31/01/2025",
    url = "https://example.smartabase.com/site",
    option = EventOption(guess_col_type = False)
)

The option classes and their associated functions are:

Available parameters for EventOption() (and similar for SyncEventOption(), except where noted):

  • interactive_mode (bool): If True, displays status messages (e.g., “Retrieved 10 event records”) and tqdm progress bars, ideal for interactive environments like Jupyter notebooks. Set to False for silent execution in automated pipelines. Defaults to True. Example:

    option = EventOption(interactive_mode = False)
    df = get_event_data(..., option = option)  # No output, suitable for scripts
    
  • cache (bool): If True, reuses an existing AMSClient() via get_client(), reducing API calls for authentication or data retrieval. Set to False for fresh data, increasing API overhead. Defaults to True. See “Caching” below.

  • guess_col_type (bool): If True, attempts to cast DataFrame columns to appropriate types (e.g., numeric for ‘duration’, datetime for ‘start_date’). If False, all columns are returned as strings, useful for consistent string handling. Defaults to True. Example:

    option = EventOption(cast_to_type = False)
    df = get_event_data(..., option = option)  # All columns as strings
    
  • download_attachment (bool): If True, downloads event attachments (e.g., PDFs, images) to attachment_directory. Adds columns like ‘attachment_path’ to the DataFrame. Defaults to False. Example:

    option = EventOption(download_attachment = True, attachment_directory = "./files")
    df = get_event_data(..., option = option)
    
  • attachment_directory (Optional[str]): Directory path for saving attachments (e.g., “./files”). Must be valid if download_attachment=True. Defaults to None. See “Attachments” below.

  • clean_names (bool): If True, converts column names to snake_case (e.g., ‘Session RPE’ to ‘session_rpe’), improving compatibility with Python. Defaults to False. Example:

    option = EventOption(clean_names = True)
    df = get_event_data(..., option = option)  # Columns like 'session_rpe'
    
  • include_user_data (bool): Only for SyncEventOption(). If True, includes user metadata (e.g., ‘about’, ‘email’) in the DataFrame. Defaults to False. Example:

    option = SyncEventOption(include_user_data = True)
    df, _ = sync_event_data(..., option = option)  # Includes 'about' column
    
  • include_uuid (bool): Only for SyncEventOption(). If True, includes user UUIDs in the DataFrame. Defaults to False. Example:

    option = SyncEventOption(include_uuid = True)
    df, _ = sync_event_data(..., option = option)  # Includes 'uuid' column
    

Filter Classes

Filters narrow data retrieval for efficiency. Use EventFilter() for get_event_data() and SyncEventFilter() for sync_event_data(). For example, to filter by email in get_event_data():

from teamworksams import EventFilter
df = get_event_data(
    form = "Training Log",
    start_date = "01/01/2025",
    end_date = "31/01/2025",
    url = "https://example.smartabase.com/site",
    filter = EventFilter(user_key = "email", user_value = "john.doe@example.com")
)

Available parameters for EventFilter() (and similar for SyncEventFilter()):

  • user_key (str): User identification method. Must be one of ‘user_id’, ‘username’, ‘email’, ‘about’, ‘group’, or ‘current_group’. Specifies how to filter users. Example:

    filter = EventFilter(user_key = "group", user_value = "TeamA")
    df = get_event_data(..., filter = filter)  # Events for 'TeamA'
    
  • user_value (Union[str, List[str]]): Value(s) for user_key. For ‘group’, specify a group name (e.g., “TeamA”). For ‘current_group’, user_value is ignored. Example:

    filter = EventFilter(user_key = "email", user_value = ["john.doe@example.com"])
    df = get_event_data(..., filter = filter)
    
  • data_key (Optional[str]): Field name in the form (e.g., ‘duration’) to filter data. Only for EventFilter(). Example:

    filter = EventFilter(data_key = "duration", data_value = "60")
    df = get_event_data(..., filter = filter)
    
  • data_value (Optional[str]): Value for data_key (e.g., “60”). Only for EventFilter().

  • data_condition (str): Condition for data_key/data_value. Options: ‘equals’, ‘not_equals’, ‘greater_than’, ‘less_than’, ‘greater_than_or_equals’, ‘less_than_or_equals’, ‘contains’. Numeric conditions (e.g., ‘greater_than’) apply to numeric fields only. Defaults to ‘equals’. Example:

    filter = EventFilter(
        data_key = "duration",
        data_value = "60",
        data_condition = "greater_than"
    )
    df = get_event_data(..., filter = filter)
    

Valid user_key values and their user_value requirements:

  • user_id: AMS-generated user IDs (e.g., [12345, 12346]).

  • username: AMS usernames (e.g., [“john.doe”, “jane.smith”]).

  • email: User emails (e.g., [”john.doe@example.com”]).

  • about: Full names (e.g., [“John Doe”]).

  • group: Group name (e.g., “TeamA”). Use get_group() to list groups.

  • current_group: Uses the group loaded during the last AMS login; user_value is ignored.

Multiple filters example:

filter = EventFilter(
    user_key = "email",
    user_value = ["john.doe@example.com", "jane.smith@example.com"],
    data_key = ["duration", "rpe"],
    data_value = ["60", "7"],
    data_condition = ["greater_than", "equals"]
)
df = get_event_data(..., filter = filter)  # Duration > 60 and RPE = 7

Note: All filter conditions must be met (logical AND). Ensure data_key, data_value, and data_condition have equal lengths.

Caching

When option.cache=True (default), export functions reuse an existing AMSClient() created by get_client(), maintaining an authenticated session and reducing API calls for login or data retrieval. For example:

option = EventOption(cache = True)
df1 = get_event_data(form = "Training Log", ..., option = option)
df2 = get_event_data(form = "Wellness", ..., option = option)  # Reuses client

Set cache=False for independent sessions, ensuring fresh data but increasing API overhead.

Date and Time Handling

The AMS API requires specific formats for date and time parameters:

  • start_date, end_date: Must be DD/MM/YYYY (e.g., “01/01/2025”) for get_event_data(). Both are required and must be valid dates, with end_date not before start_date. Example:

    df = get_event_data(start_date = "01/01/2025", end_date = "31/01/2025", ...)
    
  • time_range: Optional for get_event_data(), a tuple of h:mm AM/PM times (e.g., ("12:00 AM", "11:59 PM")). Defaults to full day if unset. Example:

    df = get_event_data(
        form = "Training Log",
        start_date = "01/01/2025",
        end_date = "01/01/2025",
        time_range = ("9:00 AM", "5:00 PM"),
        url = "..."
    )
    
  • last_synchronisation_time: Required for sync_event_data(), a Unix timestamp in milliseconds (e.g., 1677654321000 for 2023-03-01 12:25:21 UTC). Use the returned new_sync_time for subsequent calls. Example:

    df, new_sync_time = sync_event_data(
        last_synchronisation_time = 1677654321000,
        ...
    )
    

Column Types

By default, cast_to_type=True converts DataFrame columns to appropriate types (e.g., numeric for ‘duration’, datetime for ‘start_date’). Set cast_to_type=False to return all columns as strings, ensuring consistency:

option = EventOption(cast_to_type = False)
df = get_event_data(..., option = option)
print(df.dtypes)  # All columns as object (string)

Metadata columns (user_id, event_id, entered_by_user_id) are always numeric, and event dates (start_date, end_date) are strings, as required for AMS imports.

Interactive Mode

When interactive_mode=True (default), export functions display progress messages (e.g., “ℹ Requesting event data”) and tqdm progress bars, enhancing feedback in interactive environments. Set interactive_mode=False for silent execution in automated pipelines:

option = SyncEventOption(interactive_mode = False)
df, _ = sync_event_data(..., option = option)  # No output

Attachments

When download_attachment=True, attachments are saved to attachment_directory (e.g., “./files”), with paths added to the DataFrame (e.g., ‘attachment_path’). Supported file types include pdf, png, jpg, etc. Ensure the directory exists and is writable. Example:

import os
os.makedirs("./files", exist_ok = True)
option = EventOption(download_attachment = True, attachment_directory = "./files")
df = get_event_data(..., option = option)
print(df['attachment_path'].head())  # Paths to saved files

Files are named with AMS-generated IDs (e.g., 12345_1747654002120.pdf).

Error Handling#

teamworksams functions provide descriptive AMSError messages with interactive feedback. For simple scripts, rely on these:

df = get_event_data(
    form = "Invalid Form",
    start_date = "01/01/2025",
    end_date = "31/01/2025",
    url = "https://example.smartabase.com/site",
    option = EventOption(interactive_mode = True)
)

Output:

✖ Failed to fetch events: No events found for form 'Invalid Form'...
AMSError: No events found for form 'Invalid Form'...

Best Practices#

  • Filtering: Use EventFilter() or SyncEventFilter() to limit data retrieval (e.g., user_key="group") for performance.

  • Caching: Enable option.cache=True to reuse a AMSClient() across multiple calls, reducing API overhead.

  • Attachments: Specify a valid attachment_directory when download_attachment=True to avoid file write errors.

  • Synchronization: Store the new_sync_time from sync_event_data() for subsequent calls to maintain incremental updates.

  • Data Validation: Ensure form matches an existing AMS form and dates are in ‘DD/MM/YYYY’ format to avoid ValueError.

Troubleshooting#

  • Invalid Form:

    AMSError: No events found for form 'Invalid Form'...
    

    Solution: Verify the form name with your AMS administrator.

  • Invalid Date Format:

    ValueError: start_date must be in 'DD/MM/YYYY' format...
    

    Solution: Use correct format:

    df = get_event_data(form = "Training Log", start_date = "01/01/2025", ...)
    
  • No Events Found:

    AMSError: No events found for form 'Training Log'...
    

    Solution: Check date range or filter settings:

    df = get_event_data(form = "Training Log", start_date = "01/01/2024", ...)
    

Next Steps#