Tasker-Core Logging Standards
Version: 1.0 Last Updated: 2025-10-07 Status: Active Related: Observability Standardization
Table of Contents
- Philosophy
- Log Levels
- Structured Fields
- Message Style
- Instrument Macro
- Error Handling
- Examples
- Enforcement
Philosophy
Principles:
- Production-First: Logs must be parseable, searchable, and professional
- Correlation-Driven: All operations include correlation_id for distributed tracing
- Structured: Fields over string interpolation for aggregation and querying
- Concise: Clear, actionable messages without noise
- Consistent: Predictable patterns across all code
Anti-Patterns to Avoid:
- ❌ Emojis (🚀✅❌) - Breaks log parsers, unprofessional
- ❌ All-caps prefixes (“BOOTSTRAP:”, “CORE:”) - Redundant with module paths
- ❌ Ticket references (“JIRA-123”, “PROJ-40”) - Internal, meaningless externally
- ❌ String interpolation - Use structured fields instead
- ❌ Verbose messages - Be concise, let fields provide detail
Log Levels
ERROR - Unrecoverable Failures
When to Use:
- Database connection permanently lost
- Critical system component failure
- Unrecoverable state machine violation
- Data corruption detected
- Message queue unavailable
Characteristics:
- Requires immediate human intervention
- Service degradation or outage
- Cannot automatically recover
- Should trigger alerts/pages
Example:
#![allow(unused)]
fn main() {
error!(
correlation_id = %correlation_id,
task_uuid = %task_uuid,
error = %e,
"Failed to claim task for finalization: database unavailable"
);
}
WARN - Degraded Operation
When to Use:
- Retryable failures after exhausting retries
- Circuit breaker opened (degraded mode)
- Fallback behavior activated
- Rate limiting engaged
- Configuration issues (non-fatal)
- Unexpected but handled conditions
Characteristics:
- Service continues but degraded
- Automatic recovery possible
- Should be monitored for patterns
- May indicate upstream problems
Example:
#![allow(unused)]
fn main() {
warn!(
correlation_id = %correlation_id,
step_uuid = %step_uuid,
retry_count = attempts,
max_retries = max_attempts,
next_retry_at = ?next_retry,
"Step execution failed after max retries, will not retry further"
);
}
INFO - Lifecycle Events
When to Use:
- System startup/shutdown
- Task created/completed/failed
- Step enqueued/completed
- State transitions (task/step)
- Configuration loaded
- Significant business events
Characteristics:
- Normal operation milestones
- Useful for understanding flow
- Production-ready verbosity
- Default log level in production
Example:
#![allow(unused)]
fn main() {
info!(
correlation_id = %correlation_id,
task_uuid = %task_uuid,
steps_enqueued = count,
duration_ms = elapsed.as_millis(),
"Task initialization complete"
);
}
DEBUG - Detailed Diagnostics
When to Use:
- Discovery query results
- Queue depth checks
- Dependency analysis details
- Configuration value dumps
- State machine transition details
- Detailed operation flow
Characteristics:
- Troubleshooting information
- Not shown in production (usually)
- Safe to be verbose
- Helps understand “why”
Example:
#![allow(unused)]
fn main() {
debug!(
correlation_id = %correlation_id,
task_uuid = %task_uuid,
viable_steps = steps.len(),
pending_steps = pending.len(),
blocked_steps = blocked.len(),
"Step readiness analysis complete"
);
}
TRACE - Very Verbose
When to Use:
- Function entry/exit in hot paths
- Loop iteration details
- Deep parameter inspection
- Performance profiling hooks
Characteristics:
- Extremely verbose
- Usually disabled even in dev
- Performance impact acceptable
- Use sparingly
Example:
#![allow(unused)]
fn main() {
trace!(
correlation_id = %correlation_id,
iteration = i,
"Polling loop iteration"
);
}
Structured Fields
Required Fields (Context-Dependent)
Always Include:
#![allow(unused)]
fn main() {
correlation_id = %correlation_id, // ALWAYS when available
}
When Task Context Available:
#![allow(unused)]
fn main() {
correlation_id = %correlation_id,
task_uuid = %task_uuid,
namespace = %namespace,
}
When Step Context Available:
#![allow(unused)]
fn main() {
correlation_id = %correlation_id,
task_uuid = %task_uuid,
step_uuid = %step_uuid,
namespace = %namespace,
}
For Operations:
#![allow(unused)]
fn main() {
correlation_id = %correlation_id,
// ... entity IDs ...
operation = "step_enqueue", // Operation identifier
duration_ms = elapsed.as_millis(), // Timing for operations
}
For Errors:
#![allow(unused)]
fn main() {
correlation_id = %correlation_id,
// ... entity IDs ...
error = %e, // Error Display
error_type = %type_name::<E>(), // Optional: Error type
}
Field Ordering (MANDATORY)
Standard Order:
- correlation_id (always first)
- Entity IDs (task_uuid, step_uuid, namespace)
- Operation/Action (operation, state, status)
- Measurements (duration_ms, count, size)
- Error Info (error, error_type, context)
- Other Context (additional fields)
Example:
#![allow(unused)]
fn main() {
info!(
// 1. Correlation ID (ALWAYS FIRST)
correlation_id = %correlation_id,
// 2. Entity IDs
task_uuid = %task_uuid,
step_uuid = %step_uuid,
namespace = %namespace,
// 3. Operation
operation = "step_transition",
from_state = %old_state,
to_state = %new_state,
// 4. Measurements
duration_ms = elapsed.as_millis(),
// 5. No errors (success case)
// 6. Other context
processor_id = %processor_uuid,
"Step state transition complete"
);
}
Field Formatting
Use Display Formatting (%):
#![allow(unused)]
fn main() {
// ✅ CORRECT: Let tracing handle formatting
correlation_id = %correlation_id,
task_uuid = %task_uuid,
error = %e,
}
Avoid Manual Conversion:
#![allow(unused)]
fn main() {
// ❌ WRONG: Manual to_string()
task_uuid = task_uuid.to_string(),
// ❌ WRONG: Debug formatting for production types
task_uuid = ?task_uuid, // Use ? only for Debug types
}
Field Naming:
#![allow(unused)]
fn main() {
// ✅ Standard names
duration_ms // Not elapsed_ms, time_ms
error // Not err, error_message
step_uuid // Not workflow_step_uuid (be consistent)
retry_count // Not attempts, retries
max_retries // Not max_attempts
}
Message Style
Guidelines
DO:
- ✅ Be concise and actionable
- ✅ Use present tense for states: “Step enqueued”
- ✅ Use past tense for events: “Task completed”
- ✅ Start with the subject: “Task completed” not “Successfully completed task”
- ✅ Focus on WHAT happened (fields show HOW)
DON’T:
- ❌ Use emojis: “🚀 Starting…” → “Starting orchestration system”
- ❌ Use all-caps prefixes: “BOOTSTRAP: Starting…” → “Starting orchestration bootstrap”
- ❌ Include ticket numbers: “PROJ-40: Processing…” → “Processing command”
- ❌ Be redundant: “Successfully enqueued step successfully” → “Step enqueued”
- ❌ Include technical jargon: “Atomic CAS transition succeeded” → “State transition complete”
- ❌ Be verbose: Keep messages under 10 words ideally
Before/After Examples
Lifecycle Events:
#![allow(unused)]
fn main() {
// ❌ BEFORE
info!("🚀 BOOTSTRAP: Starting unified orchestration system bootstrap");
// ✅ AFTER
info!("Starting orchestration system bootstrap");
}
Operation Completion:
#![allow(unused)]
fn main() {
// ❌ BEFORE
info!("✅ STEP_ENQUEUER: Successfully marked step {} as enqueued", step_uuid);
// ✅ AFTER
info!(
correlation_id = %correlation_id,
step_uuid = %step_uuid,
"Step marked as enqueued"
);
}
Error Handling:
#![allow(unused)]
fn main() {
// ❌ BEFORE
error!("❌ ORCHESTRATION_LOOP: Failed to process task {}: {}", task_uuid, e);
// ✅ AFTER
error!(
correlation_id = %correlation_id,
task_uuid = %task_uuid,
error = %e,
"Task processing failed"
);
}
Shutdown:
#![allow(unused)]
fn main() {
// ❌ BEFORE
info!("🛑 Shutdown signal received, initiating graceful shutdown...");
// ✅ AFTER
info!("Shutdown signal received, initiating graceful shutdown");
}
Instrument Macro
When to Use
Use #[instrument] for:
- Function-level spans in hot paths
- Automatic correlation ID tracking
- Operations that should appear in traces
- Functions with significant duration
Benefits:
- Automatic span creation
- Automatic timing
- Better OpenTelemetry integration (Phase 2)
- Cleaner code
Example
#![allow(unused)]
fn main() {
use tracing::instrument;
#[instrument(skip(self), fields(
correlation_id = %correlation_id,
task_uuid = %task_uuid,
namespace = %namespace
))]
pub async fn process_task(
&self,
correlation_id: Uuid,
task_uuid: Uuid,
namespace: String,
) -> Result<TaskResult> {
// Span automatically created with fields above
info!("Starting task processing");
// ... implementation ...
info!(
duration_ms = start.elapsed().as_millis(),
"Task processing complete"
);
Ok(result)
}
}
Skip Parameters
Always skip:
self(redundant)- Large structures (use specific fields instead)
- Sensitive data (passwords, tokens, PII)
#![allow(unused)]
fn main() {
#[instrument(
skip(self, context), // Skip large context
fields(
correlation_id = %correlation_id,
task_uuid = %context.task_uuid, // Extract specific fields
)
)]
}
Error Handling
Error Context
Always include:
#![allow(unused)]
fn main() {
error!(
correlation_id = %correlation_id,
task_uuid = %task_uuid,
error = %e, // Error Display (user-friendly)
error_type = %type_name::<E>(), // Optional: For classification
"Operation failed"
);
}
Error Propagation
#![allow(unused)]
fn main() {
// ✅ Log and return for caller to handle
debug!(
correlation_id = %correlation_id,
task_uuid = %task_uuid,
error = %e,
"Step discovery query failed, will retry"
);
return Err(e);
// ❌ Don't log at every level (causes noise)
// Instead: Log once at appropriate level where handled
}
Error Classification
#![allow(unused)]
fn main() {
match result {
Err(e) if e.is_retryable() => {
warn!(
correlation_id = %correlation_id,
error = %e,
retry_count = attempts,
"Operation failed, will retry"
);
}
Err(e) => {
error!(
correlation_id = %correlation_id,
error = %e,
"Operation failed permanently"
);
}
Ok(result) => {
info!(
correlation_id = %correlation_id,
duration_ms = elapsed.as_millis(),
"Operation completed successfully"
);
}
}
}
Examples
Complete Examples by Scenario
Task Initialization
#![allow(unused)]
fn main() {
#[instrument(skip(self), fields(
correlation_id = %task_request.correlation_id,
task_name = %task_request.name,
namespace = %task_request.namespace
))]
pub async fn create_task_from_request(
&self,
task_request: TaskRequest,
) -> Result<TaskInitializationResult> {
let correlation_id = task_request.correlation_id;
let start = Instant::now();
info!("Starting task initialization");
// Create task
let task = self.create_task(&task_request).await?;
debug!(
task_uuid = %task.task_uuid,
template_uuid = %task.named_task_uuid,
"Task created in database"
);
// Discover steps
let steps = self.discover_initial_steps(task.task_uuid).await?;
info!(
correlation_id = %correlation_id,
task_uuid = %task.task_uuid,
step_count = steps.len(),
duration_ms = start.elapsed().as_millis(),
"Task initialization complete"
);
Ok(TaskInitializationResult {
task_uuid: task.task_uuid,
step_count: steps.len(),
})
}
}
Step Enqueueing
#![allow(unused)]
fn main() {
pub async fn enqueue_step(
&self,
correlation_id: Uuid,
task_uuid: Uuid,
step: &ViableStep,
) -> Result<()> {
let start = Instant::now();
debug!(
correlation_id = %correlation_id,
task_uuid = %task_uuid,
step_uuid = %step.step_uuid,
step_name = %step.name,
queue = %step.queue_name,
"Enqueueing step"
);
let message = self.create_message(correlation_id, task_uuid, step)?;
self.pgmq_client
.send(&step.queue_name, &message)
.await?;
info!(
correlation_id = %correlation_id,
task_uuid = %task_uuid,
step_uuid = %step.step_uuid,
queue = %step.queue_name,
duration_ms = start.elapsed().as_millis(),
"Step enqueued"
);
Ok(())
}
}
Error Handling
#![allow(unused)]
fn main() {
match self.process_step_result(result).await {
Ok(()) => {
info!(
correlation_id = %result.correlation_id,
task_uuid = %result.task_uuid,
step_uuid = %result.step_uuid,
duration_ms = elapsed.as_millis(),
"Step result processed"
);
}
Err(e) if e.is_retryable() => {
warn!(
correlation_id = %result.correlation_id,
task_uuid = %result.task_uuid,
step_uuid = %result.step_uuid,
error = %e,
retry_count = result.attempts,
"Step result processing failed, will retry"
);
return Err(e);
}
Err(e) => {
error!(
correlation_id = %result.correlation_id,
task_uuid = %result.task_uuid,
step_uuid = %result.step_uuid,
error = %e,
"Step result processing failed permanently"
);
return Err(e);
}
}
}
Bootstrap/Shutdown
#![allow(unused)]
fn main() {
pub async fn bootstrap() -> Result<OrchestrationSystemHandle> {
info!("Starting orchestration system bootstrap");
let config = ConfigManager::load()?;
debug!(environment = %config.environment, "Configuration loaded");
let context = SystemContext::from_config(config).await?;
info!(processor_uuid = %context.processor_uuid, "System context initialized");
let core = OrchestrationCore::new(context).await?;
info!("Orchestration core initialized");
// ... more initialization ...
info!(
processor_uuid = %core.processor_uuid,
namespaces = ?core.supported_namespaces,
"Orchestration system bootstrap complete"
);
Ok(handle)
}
pub async fn shutdown(&mut self) -> Result<()> {
info!("Initiating graceful shutdown");
if let Some(coordinator) = &self.event_coordinator {
coordinator.stop().await?;
debug!("Event coordinator stopped");
}
info!("Orchestration system shutdown complete");
Ok(())
}
}
Enforcement
Code Review Checklist
Before merging, verify:
- No emojis in log messages
- No all-caps component prefixes
- No ticket references in runtime logs
- correlation_id present in all task/step operations
- Structured fields follow standard ordering
- Messages are concise and actionable
- Appropriate log levels used
- Error context is complete
CI Checks
Recommended lints (future):
# Check for emojis
! grep -r '[🔧✅🚀❌⚠️📊🔍🎉🛡️⏱️📝🏗️🎯🔄💡📦🧪🌉🔌⏳🛑]' src/
# Check for all-caps prefixes
! grep -rE '(info|debug|warn|error)!\(".*[A-Z_]{3,}:' src/
# Check for ticket references in logs (allow in comments)
! grep -rE '(info|debug|warn|error)!.*[A-Z]+-[0-9]+' src/
Pre-commit Hook
Add to .git/hooks/pre-commit:
#!/bin/bash
./scripts/audit-logging.sh --check || {
echo "❌ Logging standards violation detected"
echo "Run: ./scripts/audit-logging.sh for details"
exit 1
}
Migration Guide
For Existing Code
- Remove emojis: Use find/replace
- Remove all-caps prefixes: Simple cleanup
- Add correlation_id: Extract from context
- Reorder fields: correlation_id first
- Shorten messages: Remove redundancy
- Verify log levels: Lifecycle = INFO, diagnostics = DEBUG
For New Code
- Always include correlation_id when context available
- Use
#[instrument]for significant functions - Follow field ordering: correlation_id, IDs, operation, measurements, errors
- Keep messages concise: Under 10 words
- Choose appropriate level: ERROR (fatal), WARN (degraded), INFO (lifecycle), DEBUG (diagnostic)
FAQ
Q: Should I use info! or debug! for step enqueueing?
A: info! - It’s a significant lifecycle event even if frequent.
Q: When should I add duration_ms?
A: For any operation that:
- Calls external systems (DB, queue)
- Is in the hot path
- Takes >10ms typically
- Needs performance monitoring
Q: Can I use emojis in error messages? A: No. Never use emojis in any log message. They break parsers and are unprofessional.
Q: Should correlation_id really always be first? A: Yes. This enables easy correlation across all logs. It’s the #1 most important field for distributed tracing.
Q: What about ticket references in module docs? A: Acceptable in module-level documentation for architectural context. Remove from runtime logs and inline comments.
Q: Can I include stack traces in logs?
A: Use error = %e which includes the error chain. Only add explicit backtrace for truly exceptional cases.
References
Document End
This is a living document. Propose changes via PR with rationale.