Data Model Reference (Enterprise Single Tenant)
Last updated 9 days ago
Introduction
If your organization is on an Enterprise Single Tenant plan, you have direct access to your analytics data in Google BigQuery. This gives you the flexibility to:
Build custom dashboards in your BI tool of choice (Looker, Tableau, Power BI, Metabase, etc.)
Run advanced SQL queries for deep-dive analysis
Export data for compliance, auditing, or data science workflows
Integrate allmates.ai data with your existing data warehouse
This guide provides a complete reference to the mates_analytics BigQuery dataset, including all tables, views, and relationships.
Architecture Overview
Data Flow
Firestore (Live Data) ↓ Eventarc triggers Cloud Functions (Change Capture) ↓ Streaming inserts all_raw_changelog (Raw Events) ↓ Hourly MERGE queries tbl_*_latest (Physical Tables) ↓ SQL Views dim_* (Dimensions) + fact_* (Facts) ↓ Your BI Tool / Custom QueriesKey Properties
Note: Your dedicated GCP project ID will be provided by allmates.ai during onboarding. Throughout this guide, we use
{YOUR_GCP_PROJECT}as a placeholder.
Getting Started
1. Access Requirements
To query the data, you need:
GCP Project ID — Provided by allmates.ai
IAM Role:
roles/bigquery.dataVieweron themates_analyticsdatasetIAM Role:
roles/bigquery.jobUseron the project to run queriesService Account Key (JSON) for programmatic access
Contact your allmates.ai account manager to request access credentials.
2. Connecting to BigQuery
From the BigQuery Console
Go to BigQuery Console
Select your project:
{YOUR_GCP_PROJECT}(provided by allmates.ai)Navigate to dataset:
mates_analyticsBrowse tables and views in the left sidebar
From a BI Tool
Most BI tools support BigQuery via:
Native connector (Looker, Tableau, Power BI, Metabase)
JDBC/ODBC driver (for other tools)
Connection settings:
Project ID:
{YOUR_GCP_PROJECT}(provided by allmates.ai)Dataset:
mates_analyticsLocation:
europe-west1Authentication: Service account JSON key (provided by allmates.ai)
3. Multi-Tenant Data Isolation
All data is scoped by organization_id.
⚠️ Critical: Always filter on organization_id in your queries to:
Avoid scanning data from other organizations (if you have multi-org access)
Leverage BigQuery clustering for performance
Respect data isolation
sql
SELECT * FROM `{YOUR_GCP_PROJECT}.mates_analytics.fact_message_tokens` WHERE organization_id = 'YOUR_ORG_ID' AND event_date >= CURRENT_DATE - 30How to find your organization_id:
Check the URL in the allmates.ai app:
app.allmates.ai/o/{organization_id}/...Or query:
SELECT organization_id, display_name FROM dim_organization
Data Model Structure
The mates_analytics dataset contains three types of objects:
1. Physical Tables (tbl_*_latest)
18 tables that store the latest state of each entity from Firestore.
Refreshed hourly via MERGE operations
Contain raw data with minimal transformations
Include audit columns:
last_synced_at,last_operation,firestore_path
Naming convention: tbl_{entity}_latest
Examples:
tbl_organization_latesttbl_message_latesttbl_agent_latest
2. Dimension Views (dim_*)
13 views that sit on top of physical tables and add:
Date derivations (e.g.,
created_date,updated_date)Extracted fields (e.g.,
connection_idfrom Firestore references)Cleaner, more stable interfaces for joins
Naming convention: dim_{entity}
Examples:
dim_organizationdim_memberdim_agent
Best practice: Always join facts to dimensions, not to physical tables.
3. Fact Views (fact_*)
9 views that contain business metrics and are the primary tables for analytics.
Each fact has a defined grain (what one row represents).
Naming convention: fact_{metric}
Examples:
fact_message_tokens— Token usage and cost per AI messagefact_conversation— Conversation-level aggregated metricsfact_tool_calls— Tool invocation events
Best practice: Start your queries from fact views for pre-computed metrics.
Key Tables Reference
Core Entities
Fact Tables
Entity Relationships
Hierarchy
Organization ├── Workspace │ └── Project (Conversation) │ ├── Message │ ├── Attachment │ └── Tool Call ├── Member (User in Organization) ├── Agent (Mate) ├── Model ├── Connection └── Tool InstanceKey Foreign Keys
Common Use Cases & Queries
Note: Replace
{YOUR_GCP_PROJECT}with your actual GCP project ID provided by allmates.ai.
1. Token Usage by Organization (Last 30 Days)
Use case: Track AI consumption and costs across organizations.
sql
SELECT o.display_name AS organization_name, COUNT(*) AS message_count, SUM(f.input_tokens) AS total_input_tokens, SUM(f.output_tokens) AS total_output_tokens, SUM(f.total_tokens) AS total_tokens, ROUND(SUM(f.estimated_cost_usd), 2) AS total_cost_usd FROM `{YOUR_GCP_PROJECT}.mates_analytics.fact_message_tokens` f JOIN `{YOUR_GCP_PROJECT}.mates_analytics.dim_organization` o USING (organization_id) WHERE f.event_date >= CURRENT_DATE - 30 GROUP BY o.display_name ORDER BY total_cost_usd DESC2. Daily Active Users
Use case: Track user engagement over time.
sql
SELECT event_date, COUNT(DISTINCT sender_member_id) AS active_users, COUNT(*) AS total_messages, ROUND(COUNT(*) / COUNT(DISTINCT sender_member_id), 1) AS messages_per_user FROM `{YOUR_GCP_PROJECT}.mates_analytics.fact_message` WHERE organization_id = 'YOUR_ORG_ID' AND sender_type = 'human' AND event_date >= CURRENT_DATE - 30 GROUP BY event_date ORDER BY event_date3. Most Active Conversations
Use case: Identify high-engagement conversations.
sql
SELECT project_id, conversation_name, total_message_count, unique_human_participants, unique_agents_solicited, TIMESTAMP_DIFF(last_message_at, first_message_at, HOUR) AS duration_hours, activity_status, engagement_category FROM `{YOUR_GCP_PROJECT}.mates_analytics.fact_conversation` WHERE organization_id = 'YOUR_ORG_ID' AND activity_status IN ('ACTIVE_NOW', 'ACTIVE_TODAY') ORDER BY total_message_count DESC LIMIT 204. Tool Performance Analysis
Use case: Monitor tool reliability and performance.
sql
SELECT tool_name, COUNT(*) AS call_count, COUNTIF(is_successful) AS success_count, ROUND(COUNTIF(is_successful) / COUNT(*) * 100, 1) AS success_rate_pct, ROUND(AVG(duration_seconds), 2) AS avg_duration_sec, ROUND(SUM(cost), 2) AS total_cost_usd FROM `{YOUR_GCP_PROJECT}.mates_analytics.fact_tool_calls` WHERE organization_id = 'YOUR_ORG_ID' AND event_date >= CURRENT_DATE - 7 GROUP BY tool_name ORDER BY call_count DESC5. Cost by LLM Model
Use case: Understand which models are driving costs.
sql
SELECT m.provider AS provider, m.label AS model_name, COUNT(*) AS message_count, SUM(f.total_tokens) AS total_tokens, ROUND(SUM(f.estimated_cost_usd), 2) AS total_cost_usd, ROUND(AVG(f.total_tokens), 0) AS avg_tokens_per_response FROM `{YOUR_GCP_PROJECT}.mates_analytics.fact_message_tokens` f JOIN `{YOUR_GCP_PROJECT}.mates_analytics.dim_model` m ON f.model_id = m.model_id WHERE f.organization_id = 'YOUR_ORG_ID' AND f.event_date >= CURRENT_DATE - 30 GROUP BY m.provider, m.label ORDER BY total_cost_usd DESC6. Multi-Agent Chain Analysis
Use case: Understand agent-to-agent collaboration patterns.
sql
SELECT project_id, COUNT(*) AS total_agent_messages, COUNTIF(is_agent_response) AS direct_responses, COUNTIF(is_in_agent_chain) AS chain_messages, MAX(messages_since_last_human) AS max_chain_depth, ROUND(AVG(messages_since_last_human), 1) AS avg_chain_depth FROM `{YOUR_GCP_PROJECT}.mates_analytics.fact_message` WHERE organization_id = 'YOUR_ORG_ID' AND event_date >= CURRENT_DATE - 7 AND sender_type = 'agent' GROUP BY project_id HAVING COUNT(*) > 10 ORDER BY max_chain_depth DESC LIMIT 207. File Upload Trends
Use case: Track file usage and storage consumption.
sql
SELECT event_date, COUNT(*) AS files_uploaded, ROUND(SUM(file_size_mb), 2) AS total_size_mb, ROUND(AVG(file_size_mb), 2) AS avg_file_size_mb, SUM(token_count) AS total_tokens_extracted FROM `{YOUR_GCP_PROJECT}.mates_analytics.fact_attachment` WHERE organization_id = 'YOUR_ORG_ID' AND event_date >= CURRENT_DATE - 30 GROUP BY event_date ORDER BY event_date8. Monthly Recurring Revenue (MRR)
Use case: Track subscription revenue.
sql
SELECT o.display_name AS organization_name, f.product_name, SUM(f.mrr_amount) AS total_mrr_usd, SUM(f.arr_amount) AS total_arr_usd, COUNT(*) AS active_subscriptions FROM `{YOUR_GCP_PROJECT}.mates_analytics.fact_subscription_snapshot` f JOIN `{YOUR_GCP_PROJECT}.mates_analytics.dim_organization` o USING (organization_id) WHERE f.is_active = TRUE GROUP BY o.display_name, f.product_name ORDER BY total_mrr_usd DESCPerformance Optimization Tips
1. Always Filter on organization_id
All tables are clustered on organization_id as the first clustering column. Always include it in your WHERE clause for optimal performance.
sql
-- ✅ Good: Uses clustering SELECT * FROM `{YOUR_GCP_PROJECT}.mates_analytics.fact_message_tokens` WHERE organization_id = 'YOUR_ORG_ID' AND event_date >= CURRENT_DATE - 7 -- ❌ Bad: Full table scan SELECT * FROM `{YOUR_GCP_PROJECT}.mates_analytics.fact_message_tokens` WHERE sender_display_name = 'Alice'2. Use Date Filters
Most fact tables include an event_date column. Always filter on date ranges to reduce data scanned.
sql
-- ✅ Good: Scans only 30 days WHERE event_date >= CURRENT_DATE - 30 -- ❌ Bad: Scans entire history WHERE event_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)3. Query Fact Views, Not Physical Tables
Fact views pre-compute joins, categories, and metrics. They are optimized for analytics queries.
sql
-- ✅ Good: Use fact view SELECT * FROM `{YOUR_GCP_PROJECT}.mates_analytics.fact_message_tokens` WHERE organization_id = 'YOUR_ORG_ID' -- ❌ Bad: Manual joins SELECT m.*, mm.input_tokens, mm.output_tokens FROM `{YOUR_GCP_PROJECT}.mates_analytics.tbl_message_latest` m LEFT JOIN `{YOUR_GCP_PROJECT}.mates_analytics.tbl_message_metadata_latest` mm ON m.message_id = mm.message_id WHERE m.organization_id = 'YOUR_ORG_ID'4. Leverage Clustering Columns
Each table has specific clustering columns for optimal query performance.
Key clustering columns:
Example:
sql
-- ✅ Good: Uses both clustering columns SELECT * FROM `{YOUR_GCP_PROJECT}.mates_analytics.tbl_message_latest` WHERE organization_id = 'YOUR_ORG_ID' AND project_id = 'PROJECT_123' -- ⚠️ OK: Uses first clustering column only SELECT * FROM `{YOUR_GCP_PROJECT}.mates_analytics.tbl_message_latest` WHERE organization_id = 'YOUR_ORG_ID'5. Use LIMIT for Exploratory Queries
When exploring data, always use LIMIT to avoid scanning large datasets.
sql
-- ✅ Good: Limits results SELECT * FROM `{YOUR_GCP_PROJECT}.mates_analytics.fact_message_tokens` WHERE organization_id = 'YOUR_ORG_ID' LIMIT 100 -- ❌ Bad: Scans entire table SELECT * FROM `{YOUR_GCP_PROJECT}.mates_analytics.fact_message_tokens` WHERE organization_id = 'YOUR_ORG_ID'Data Freshness & Updates
Refresh Schedule
Physical tables (
tbl_*_latest): Updated every 1 hour via MERGE operationsDimension views (
dim_*): Real-time (views on top of physical tables)Fact views (
fact_*): Real-time (views on top of physical tables)
Data Latency
Typical latency: 1–2 hours from event occurrence to availability in BigQuery
Peak latency: Up to 3 hours during high-volume periods
Checking Last Update
Each physical table has a last_synced_at column showing when the row was last updated.
sql
SELECT organization_id, MAX(last_synced_at) AS last_update FROM `{YOUR_GCP_PROJECT}.mates_analytics.tbl_message_latest` WHERE organization_id = 'YOUR_ORG_ID' GROUP BY organization_idCost Estimation
Pricing Model
BigQuery charges based on:
Storage: $0.02 per GB per month (long-term storage: $0.01 per GB)
Queries: $5 per TB scanned
Typical Query Costs
Cost optimization tips:
Always filter on
organization_idandevent_dateUse
LIMITfor exploratory queriesAvoid
SELECT *— specify only needed columnsUse materialized views for frequently run queries
Appendix: Complete Table List
Physical Tables (18)
tbl_organization_latesttbl_workspace_latesttbl_project_latesttbl_user_latesttbl_member_latesttbl_agent_latesttbl_model_latesttbl_connection_latesttbl_tool_instance_latesttbl_message_latesttbl_message_metadata_latesttbl_chat_metadata_latesttbl_tools_call_latesttbl_attachment_latesttbl_polar_customer_latesttbl_polar_subscription_latesttbl_polar_meter_latesttbl_invitation_latest
Dimension Views (13)
dim_organizationdim_workspacedim_projectdim_userdim_memberdim_agentdim_modeldim_connectiondim_tool_instancedim_providerdim_customerdim_subscriptiondim_invitation
Fact Views (9)
fact_message_tokensfact_messagefact_conversationfact_tool_callsfact_chat_tokensfact_attachmentfact_subscription_snapshotfact_meter_usagefact_message_errors
Reference Tables (1)
ref_model_pricing