To Comment or Not To Comment

Navigating the Art and Science of Code Comments

Indrajeet Patil

René Magritte's 'The Treachery of Images' painting showing a pipe with text 'Ceci n'est pas une pipe' (This is not a pipe)

What You’ll Learn

  • Why commenting matters for code quality and maintainability
  • When to comment and when not to comment
  • Practical strategies for writing clear, helpful, and meaningful comments


🎯 Goal

Learning to write comments that clarify intent, not echo code.


Although the examples are in Python, all the strategies are language-agnostic.
None of this advice is dogma; there can be valid reasons and conventions to break these rules.

“The purpose of commenting is to help the reader know as much as the writer did.”

- Boswell & Foucher

Why Comments Matter

Good comments act as guideposts for navigating complex code.

Why It’s Challenging

Multiple pressures during development make thoughtful commenting difficult.

N.B. Documentation vs. Comments

import re

def is_valid_email(email: str) -> bool:
    """Check if email address is valid.
    
    Args:
        email: Email address to validate
    
    Returns:
        True if valid, False otherwise
    """
    # RFC 5322 simplified: local@domain.tld format
    local_part = r'[a-zA-Z0-9._%+-]+'
    domain = r'[a-zA-Z0-9.-]+'
    tld = r'[a-zA-Z]{2,}'
    
    pattern = rf'^{local_part}@{domain}\.{tld}$'
    return re.match(pattern, email) is not None

Documentation

  • How to use the function
  • Parameters & returns
  • For function users
  • For public interface

Comments

  • Why it works this way
  • Business rules & context
  • For code maintainers
  • For private implementation
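The split above also shows up at runtime. A minimal sketch, assuming a hypothetical `is_even` function: the docstring is stored on the function object and can be read by users and tools, while the comment exists only in the source file.

```python
import inspect


def is_even(number: int) -> bool:
    """Return True if number is even."""
    # Bitwise test on the lowest bit; equivalent to number % 2 == 0.
    return (number & 1) == 0


# Documentation: available to users via help(), IDEs, and doc generators.
docstring = inspect.getdoc(is_even)

# Comment: not stored anywhere on the function object.
comment_in_docstring = "Bitwise" in (is_even.__doc__ or "")
```

Here `docstring` holds the one-line summary and `comment_in_docstring` is `False`, which is why implementation notes belong in comments and usage notes in documentation.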

What Not to Comment

Every comment distracts from reading code—make sure it’s worth the interruption.

Don’t use crutch comments

Fix the unclear code instead: good code > bad code + comments.

❌ Comment compensates for bad naming

# Check if user can access premium features
def chk(u):
    return u.tier == 'gold' or u.tier == 'platinum'

✅ Clear naming needs no comment

def has_premium_access(user):
    return user.tier in ['gold', 'platinum']

❌ Comment explains complex conditional

# Process if positive and (even or divisible by 5)
if x > 0 and (x % 2 == 0 or x % 5 == 0):
    process(x)

✅ Well-named function obviates comment

def should_process(x):
    return x > 0 and (x % 2 == 0 or x % 5 == 0)

if should_process(x):
    process(x)

❌ Closing brace comments signal complexity

Indicates overly complex, deeply nested code that needs refactoring into smaller, well-named functions (example from JavaScript).

while (...) {
    try {
        for (...) {
            ...
        } // end for
    } catch {
        ...
    } // end try
} // end while

Don’t duplicate the code in comments

❌ Bad

# increment counter
counter += 1

# user name variable
user_name = "john"

✅ Good

# Reset retry count after successful connection
counter = 0

# Store normalised username for database
user_name = normalize(input)

Why not ‘Don’t state the obvious’?

Obvious depends on your audience. Code clear to senior developers may confuse juniors.

Consider team experience when deciding what needs explanation.

Don’t write non-local comments

❌ Requires checking other files

# See UserService for validation logic
user = create_user(data)

✅ Self-contained explanation

# Email validated before creation (RFC 5322)
# Duplicates rejected via unique constraint
user = create_user(data)

Don’t retain dead code as comments

❌ Bad

# old_function()
new_function()

✅ Good

new_function()

Don’t narrate git history

❌ Narrative changelog

# Changed timeout from 5 to 10 on Jan 15
# Updated to 30 on Feb 3 for prod issue
# Reduced to 20 on Feb 10 per team discussion
timeout = 20

✅ Point to commit/issue

# Timeout tuned for production load
# See commit abc123f or issue #456
timeout = 20

Exception: Bundled/distributed code without git access may need key change notes in comments.

Don’t confuse docs with comments

❌ Bad

"""Implementation uses binary search
for O(log n) complexity"""

✅ Good

"""Find user by ID."""
# Use binary search for O(log n) lookup

Don’t write mandated comments

Arbitrary rules like “one comment per 10 lines” or “every function must have a comment” lead to noise, not clarity.

Write comments when they add value, not to meet quotas.

Comic strip showing code commenting guidelines being interpreted as 'write more comments' leading to excessive commenting

Source: Geek & Poke

Don’t write confusing comments

❌ Confusing negative framing

# Skip notification if user is inactive
if user.is_active:
    send_notification(user)

✅ Code is clearer without comment

if user.is_active:
    send_notification(user)

❌ Misleading comment

# Add item to cart
user.remove_item(item_id)

✅ Reading code is faster

# (no comment - code speaks for itself)
user.remove_item(item_id)

Bad comments are worse than no comments. They mislead readers and create confusion.

What to Comment



Whatever helps the reader understand the code more easily: the what, the why, the how.

Gandalf the Grey from Lord of the Rings with text 'You shall not pass' - representing your future self reviewing unclear code

Your future self as the reader

Explain the Thought Process

Good comments explain business logic, design decisions, and non-obvious requirements.

❌ Comments reiterate code

# Timeout value is 30 seconds
timeout = 30

# Connect to database
db.connect(retry=True)

# as per spec
if len(data) == 0:
    return None

✅ Comments explain rationale

# Timeout to handle slow network conditions
# in international deployments
timeout = 30

# Retry enabled to handle observed
# transient network failures
db.connect(retry=True)

# Return None for empty datasets per
# https://api-docs.example.com/v2.1#section-4.3
if len(data) == 0:
    return None

Document Non-Obvious Logic

Complex logic benefits from explanation—even when developers could eventually figure it out.

❌ Cryptic regex

pattern = r'^(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$'
if re.match(pattern, password):
    return True

✅ Regex requirements explained

# Password must have: uppercase letter, digit,
# special char (@$!%*?&), min 8 characters
# Note: ASCII-only, no entropy check
pattern = r'^(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$'
if re.match(pattern, password):
    return True
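One possible alternative, sketched here under the assumption that the regex can be compiled once: Python's `re.VERBOSE` flag allows comments inside the pattern itself, so each requirement sits next to the part that enforces it. The names `PASSWORD_PATTERN` and `is_valid_password` are hypothetical.

```python
import re

PASSWORD_PATTERN = re.compile(
    r"""
    ^
    (?=.*[A-Z])            # at least one uppercase letter
    (?=.*\d)               # at least one digit
    (?=.*[@$!%*?&])        # at least one special character
    [A-Za-z\d@$!%*?&]{8,}  # only allowed characters, minimum 8 of them
    $
    """,
    re.VERBOSE,
)


def is_valid_password(password: str) -> bool:
    # Same rules as the single-line pattern; still no entropy check.
    return PASSWORD_PATTERN.match(password) is not None
```

Whether this reads better than a short block comment above a one-line pattern is a judgment call; it tends to pay off as the pattern grows.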

❌ Unexplained bit manipulation

def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0

✅ Logic clarified

def is_power_of_two(n):
    # Powers of 2 have a single bit set;
    # n & (n - 1) clears the lowest set bit, leaving 0
    return n > 0 and (n & (n - 1)) == 0
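A small worked example can back up the comment. The sketch below prints nothing; it only formats a few values in binary to show what `n & (n - 1)` does for a power of two (8) and for a non-power of two (6).

```python
def is_power_of_two(n: int) -> bool:
    # n & (n - 1) clears the lowest set bit; a power of two has only one.
    return n > 0 and (n & (n - 1)) == 0


# Power of two: the single set bit is cleared, nothing remains.
eight = format(8, "04b")
seven = format(7, "04b")
eight_and_seven = format(8 & 7, "04b")

# Not a power of two: a higher bit survives.
six_and_five = format(6 & 5, "04b")
```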

Explain Unidiomatic Code

When you intentionally deviate from language conventions, explain why the idiomatic approach doesn’t work and preempt questions.

❌ No explanation for unusual approach

# Process items
result = []
for item in items:
    if item > 0:
        result.append(item * 2)

✅ Explains deviation from idiom

# Use explicit loop to allow early break on error
# (list comprehension evaluates all items upfront)
result = []
for item in items:
    if item > 0:
        result.append(item * 2)

❌ Unidiomatic without context

file = open('data.txt', 'r')
content = file.read()
file.close()

✅ Justifies non-Pythonic code

# Cannot use context manager: file handle must remain
# open for async callback (library limitation)
file = open('data.txt', 'r')
content = file.read()
file.close()

Highlight Known Flaws

It’s acceptable to document known issues, limitations, or future improvements using action comments.

❌ Vague action comment

# TODO: fix this
def process_data(items):
    return [x * 2 for x in items]

✅ Specific with tracking

# TODO: Add validation for empty list (issue #847)
def process_data(items):
    return [x * 2 for x in items]

Common Action Comment Tags

| Tag | Purpose | Example |
|---|---|---|
| TODO | Planned improvement or missing feature | # TODO: Add caching (issue #123) |
| FIXME | Known bug that needs fixing | # FIXME: Fails on negative input (#456) |
| HACK | Temporary workaround for a problem | # HACK: API bug workaround (ticket #789) |
| NOTE | Important clarification or caveat | # NOTE: Must run before init() |
| OPTIMIZE | Performance improvement opportunity | # OPTIMIZE: Use binary search (#234) |

Tip: Use the Better Comments VS Code extension to highlight these tags in your editor.

Aid Comprehension with Examples

Well-chosen examples often clarify complex code more effectively than detailed comments alone.

❌ Comment without example

# Modulo with negatives wraps backward, not toward zero
index = position % array_length

✅ Comment with examples

# Modulo with negatives wraps backward, not toward zero
# -5 % 3 = 1 (not -2), 5 % 3 = 2
# Useful for circular array indexing
index = position % array_length
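The claims in the comment can be checked directly. A minimal sketch, assuming a hypothetical `circular_index` helper and a three-item list; `math.fmod` is included only to contrast Python's `%` with the truncating behavior found in C-like languages.

```python
import math


def circular_index(position: int, array_length: int) -> int:
    # Python's % takes the sign of the divisor, so the result
    # always lands in the range 0 .. array_length - 1.
    return position % array_length


days = ["Mon", "Tue", "Wed"]
before_first = days[circular_index(-1, len(days))]
after_last = days[circular_index(3, len(days))]

# Truncating remainder keeps the sign of the dividend instead.
truncated = math.fmod(-5, 3)
```

With these inputs, stepping back from the first item wraps to the last one and stepping past the end wraps to the first.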

❌ Edge cases unclear

# Calculate age, accounting for birthday occurrence
age = today.year - born.year - \
      ((today.month, today.day) < (born.month, born.day))

✅ Examples clarify edge cases

# Calculate age, accounting for birthday occurrence
# Born Feb 29, 2020, today Feb 28, 2024 -> 3 (not 4)
# Born Jan 15, 2020, today Jan 14, 2024 -> 3 (not 4)
age = today.year - born.year - \
      ((today.month, today.day) < (born.month, born.day))
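The examples in the comment can also live in a test. A sketch, assuming the expression is wrapped in a hypothetical `age_on` function so it can be called with fixed dates:

```python
from datetime import date


def age_on(today: date, born: date) -> int:
    # The tuple comparison is True (counted as 1) when this year's
    # birthday has not happened yet.
    birthday_pending = (today.month, today.day) < (born.month, born.day)
    return today.year - born.year - birthday_pending


leap_day_before = age_on(date(2024, 2, 28), date(2020, 2, 29))
leap_day_exact = age_on(date(2024, 2, 29), date(2020, 2, 29))
day_before_birthday = age_on(date(2024, 1, 14), date(2020, 1, 15))
```

Comments with examples and tests complement each other: the comment helps the reader at the point of use, the test keeps the example from silently going stale.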

Use Comments as Cautionary Text

Cautionary comments prevent well-intentioned “fixes” that break subtle constraints by warning against seemingly obvious “improvements”.

❌ No warning about dead-end

# Linear search through users
for user in users:
    if user.id == target_id:
        return user

✅ Warns against futile optimization

# Linear search through users
# DON'T optimize: list is always <20 items
# Binary search tested in PR #456: 0.01ms vs 0.009ms
for user in users:
    if user.id == target_id:
        return user

❌ Missing context for constraint

sleep(0.1)
response = api.fetch(url)

✅ Explains critical timing

# WARNING: 100ms delay required by API rate limit
# Removing causes 429 errors (see incident #2341)
sleep(0.1)
response = api.fetch(url)

Explain Security-Sensitive Code

Security comments prevent dangerous “simplifications” that introduce vulnerabilities by explaining why “obvious” optimizations break security boundaries.

❌ No warning about security boundary

# Compare password hashes
if user_hash == provided_hash:
    return True

✅ Explains timing attack protection

# Compare password hashes
# WARNING: Use constant-time comparison
# Regular == leaks timing info (CVE-2023-xxxx)
if secrets.compare_digest(user_hash, provided_hash):
    return True
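For a self-contained version of the pattern, here is a minimal sketch. The function names, the fixed salt, and the PBKDF2 parameters are assumptions for illustration only, not a recommended configuration; `secrets.compare_digest` is the standard-library call used in the example above.

```python
import hashlib
import secrets


def hash_password(password: str, salt: bytes) -> bytes:
    # Illustrative parameters only; pick values per your security policy.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def password_matches(stored_hash: bytes, password: str, salt: bytes) -> bool:
    provided_hash = hash_password(password, salt)
    # WARNING: keep the constant-time comparison.
    # Regular == can return early at the first differing byte,
    # which leaks timing information.
    return secrets.compare_digest(stored_hash, provided_hash)


demo_salt = b"fixed-demo-salt"  # hypothetical; real salts are random per user
stored = hash_password("s3cret", demo_salt)
```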

❌ Unclear sanitization boundary

output = user_input.replace('<', '&lt;')
display(output)

✅ Marks security boundary

# SECURITY: HTML escape before display
# Only '<' escaped; XSS still possible via other chars
# TODO: Use proper HTML sanitization library
output = user_input.replace('<', '&lt;')
display(output)
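As a possible first step toward that TODO, the standard library's `html.escape` covers the characters that the single `replace` call misses. This is a sketch, not a full sanitizer: it is suitable for text placed in HTML element content or quoted attributes, and the `escape_for_html` name and sample payload are hypothetical.

```python
import html


def escape_for_html(user_input: str) -> str:
    # SECURITY: escapes &, < and >, plus both quote characters (quote=True).
    # Not sufficient for URLs, inline scripts, or unquoted attributes.
    return html.escape(user_input, quote=True)


payload = '<img src=x onerror="alert(1)">'
partially_escaped = payload.replace('<', '&lt;')
fully_escaped = escape_for_html(payload)
```

Comparing `partially_escaped` with `fully_escaped` shows what the original one-character replacement leaves behind: the closing `>` and the double quotes.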

Document Concurrency Invariants

Concurrency comments prevent race conditions from “optimizations” that remove checks by explaining locking order, invariants, and why seemingly redundant logic matters.

❌ Unexplained lock usage

with self.lock:
    if self.state == 'ready':
        self.state = 'processing'
        return self.process()

Refactoring to “simplify” ↓

❌ Simpler but introduces race condition

# Someone removed "unnecessary" lock to reduce nesting
if self.state == 'ready':
    self.state = 'processing'
    return self.process()

✅ Explains invariant and lock order

# INVARIANT: state transitions require lock
# Lock order: self.lock before db.lock (prevents deadlock)
# Double-check pattern: state may change during wait
with self.lock:
    if self.state == 'ready':
        self.state = 'processing'
        return self.process()
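The invariant can be exercised with a small runnable sketch. The `Job` class and `count_successful_starts` helper are hypothetical stand-ins for the code above; the point is that the check and the state change happen under one lock, so only one thread wins.

```python
import threading


class Job:
    def __init__(self) -> None:
        self.lock = threading.Lock()
        self.state = "ready"

    def try_start(self) -> bool:
        # INVARIANT: check and transition happen under the same lock,
        # so at most one caller moves the job from 'ready' to 'processing'.
        with self.lock:
            if self.state == "ready":
                self.state = "processing"
                return True
            return False


def count_successful_starts(num_threads: int = 8) -> int:
    job = Job()
    results = []

    def worker() -> None:
        results.append(job.try_start())

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(results)
```

A test asserting that `count_successful_starts()` returns 1 passes with the lock in place. Removing the lock makes the outcome timing-dependent, and a test that passes by luck is exactly why the comment is needed.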

Organize Long Functions

When a function is legitimately complex and shouldn’t be broken down further, use comments to delineate logical sections.

Note: Refactoring into smaller functions is still preferable unless there’s a strong reason (tight I/O coupling, measured performance need, transaction boundaries).

Use section comments to:

  • Group related operations
  • Mark distinct processing phases
  • Improve readability of long functions
  • Guide readers through complex logic

def process_order(order):
    # Validate order data
    ...

    # Calculate pricing and discounts
    ...

    # Check inventory availability
    ...

    # Process payment
    ...

    # Update database and send confirmations
    ...

    return result

How to Comment

Keep them precise and compact.

Use information-dense words

❌ Verbose

# Store function results based on inputs
# to avoid recomputing same values
@save_results
def calculate(x, y):
    return x ** y

✅ Uses technical term

# Memoize expensive computation
@save_results
def calculate(x, y):
    return x ** y
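The term "memoize" is information-dense because readers can map it to a known technique. As a sketch of what that technique looks like with the standard library, the hypothetical `@save_results` decorator could be replaced by `functools.lru_cache`; `call_count` is added here only to make the caching visible.

```python
from functools import lru_cache

call_count = 0


# Memoize expensive computation
@lru_cache(maxsize=None)
def calculate(x: int, y: int) -> int:
    global call_count
    call_count += 1  # counts real computations, not cache hits
    return x ** y


first = calculate(2, 10)
second = calculate(2, 10)  # served from the cache
```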

Avoid pronouns

❌ Ambiguous pronouns

# Sync local cache with remote database
# and invalidate it
sync(local_cache, remote_db)
invalidate()

✅ Specific nouns

# Sync local cache with remote database
# and invalidate local cache
sync(local_cache, remote_db)
invalidate()

Explain intent, not implementation

❌ Lower-level details

# Loop through array and add to sum
total = sum(prices)

✅ Higher-level intent

# Calculate order total for tax computation
total = sum(prices)

Keep comments updated

❌ Comment lies after refactor

# Calculate average of all values
return median(values)  # Changed from mean()!

✅ Comment matches code

# Calculate median for robust aggregation
# (changed from mean to handle outliers)
return median(values)

Use specific references

❌ Vague reference

# See the function above
apply_discount()

✅ Clear reference

# Rate logic in pricing.calculate_discount()
apply_discount()

Maintain professional tone

❌ Unprofessional

# This code is a mess
process_legacy_data()

✅ Professional

# Complex legacy integration, needs refactoring
process_legacy_data()

Stay objective

❌ Emotional/subjective

# Stupid requirement from management
validate_input()

✅ Factual

# Business requirement: process within 24h
validate_input()

Avoid inside jokes

❌ Unclear to outsiders

# Here be dragons
process_transaction()

✅ Clear explanation

# Handles concurrent writes with optimistic locking
process_transaction()

Avoid editing banner comments

❌ Editing auto-generated sections

# ============================================
# Auto-generated by config.py v2.3.1
# Last updated: 2024-01-15
# ============================================
# MODIFIED: Changed default timeout
timeout = 60

✅ Link to generator

# Auto-generated by config.py
# See tools/config.py to modify defaults
timeout = 30

Aside: Comment Aesthetics

While content matters most, consistent formatting improves readability.

Proper indentation

def process():
    # Comment matches code indentation
    if condition:
        # Nested comment aligned with code
        execute()

Vertically align end-of-line comments

timeout = 30        # API rate limit
retries = 3         # Network failures
buffer_size = 1024  # Chunk size in bytes

Block comments with breathing room

#
# This is a complex algorithm requiring
# detailed explanation across multiple lines
#
def complex_algorithm():
    ...

Consistent capitalization

# Calculate total price
total = sum(prices)

# Apply discount to premium users
if user.is_premium:
    apply_discount()

Consistent punctuation

# Option 1: No punctuation
# Calculate average
# Filter outliers
# Return result

# Option 2: Full sentences
# Calculate the average of all values.
# Filter out statistical outliers.
# Return the final result.

Tip: Configure your linter/formatter to enforce these aesthetic choices consistently.

Tools & Techniques

“Code never lies, comments sometimes do.”

— Ron Jeffries

Tool Limitations

What tools CAN do:

  • Check comment formatting and style
  • Flag TODO/FIXME comments
  • Detect missing documentation
  • Generate basic API documentation

What they CANNOT do:

  • Understand if comments explain the “why”
  • Assess comment usefulness and clarity
  • Determine if business context is missing
  • Evaluate comment accuracy after code changes
  • Judge whether comments add value

The fundamental limitation

Tools can enforce format but not value. Good commenting requires human judgment about what information is helpful.
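To make the "format, not value" point concrete, here is a minimal sketch of the kind of check a tool can do: flag action comments that lack an issue reference. The function name, the tag list, and the `#123`-style reference format are assumptions. The line-based regex is deliberately naive and would also match `#` inside string literals.

```python
import re

ACTION_TAG = re.compile(r"#\s*(TODO|FIXME|HACK|OPTIMIZE)\b(.*)")
ISSUE_REFERENCE = re.compile(r"#\d+")


def find_untracked_action_comments(source: str) -> list:
    """Return (line_number, tag) pairs for action comments without an issue number."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        match = ACTION_TAG.search(line)
        if match and not ISSUE_REFERENCE.search(match.group(2)):
            findings.append((line_number, match.group(1)))
    return findings


sample = """\
# TODO: fix this
def process_data(items):
    # FIXME: Fails on negative input (#456)
    return [x * 2 for x in items]
"""

findings = find_untracked_action_comments(sample)
```

The check can tell that the first comment has no tracking reference. It cannot tell whether issue #456 still exists or whether either comment is accurate.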

AI as an Ally

How AI tools can help:

  • Analyze code complexity and suggest documentation
  • Identify business logic that needs explanation
  • Check comment clarity and helpfulness

LLM-Generated Comments

LLMs can overcomment or generate redundant comments, but they can also tighten verbose ones.

Human-generated comment:

- # Retry up to 3 times with exponential backoff
- # to handle transient network failures

LLM-suggested improvement:

+ # Retry 3x with exponential backoff for network errors

Always review LLM-generated comments critically.

Code Review

Fresh eyes catch what you miss: outdated comments, missing context, and unclear intent!

Summary

“Good code is its own best documentation.”

— Steve McConnell

Comment Decision Workflow

When encountering unclear code, prioritize refactoring over commenting. Only add comments for inherently non-obvious logic.


“Don’t comment bad code — rewrite it.”

— Kernighan & Plauger

Workflow in Action

Applying the decision workflow: refactor first, comment only what remains non-obvious.

❌ Poor naming + excessive comments

before.py
def calc(d, t):
    # Initialize result variable
    r = 0
    # Loop through data array
    for i in range(len(d)):
        # Check if threshold is exceeded
        if d[i] >= t:
            # Add to running total
            r = r + d[i]
        # Otherwise skip the value
        else:
            continue
    # Return the final result
    return r

✅ Clear naming + selective comment

after.py
def sum_values_above_threshold(data, threshold):
    # Use >= instead of > due to sensor calibration specs:
    # boundary readings confirmed valid (see ticket #891)
    filtered_values = (value for value in data if value >= threshold)
    return sum(filtered_values)

Transformation: Renamed function/variables + Pythonic idiom → eliminated 6 redundant comments. Added 1 comment explaining non-obvious business decision (>= vs >).
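A refactor like this is safest when a quick check confirms the behavior did not change. The sketch below restates both versions without comments and compares them on hypothetical sensor readings, including a value exactly at the threshold, since that boundary is what the surviving comment is about.

```python
def calc(d, t):
    r = 0
    for i in range(len(d)):
        if d[i] >= t:
            r = r + d[i]
    return r


def sum_values_above_threshold(data, threshold):
    # Use >= instead of > so boundary readings are included.
    return sum(value for value in data if value >= threshold)


readings = [3, 7, 10, 12]  # hypothetical data; 10 sits on the threshold
before_total = calc(readings, 10)
after_total = sum_values_above_threshold(readings, 10)
```

Both totals include the boundary reading. A version using `>` would drop it, which is the mistake the comment guards against.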

Benefits: Context and Understanding


  • Well-written comments make complex code comprehensible.

  • Writing thoughtful comments forces you to articulate your reasoning and design decisions, improving code quality.

  • Good comments preserve business knowledge and domain expertise for future developers.

  • Consistent commenting practices reduce cognitive overload and make maintenance safer.

Illustration of a cherry on top of a cake, symbolizing that good comments are the finishing touch to well-written code


Invest time in thoughtful comments early—practice makes perfect.

Thank You

And Happy Commenting! 😊




Aside: Cross-Language Examples

Syntax varies by language; the distinction between documentation and comments is universal.

| Language | Documentation | Comment | Documentation Tools |
|---|---|---|---|
| Python | """docstring""" | # comment | Sphinx, pdoc |
| JavaScript | /** JSDoc */ | // comment | JSDoc, TypeDoc |
| TypeScript | /** JSDoc */ | // comment | TypeDoc, TSDoc |
| Java | /** Javadoc */ | // comment | Javadoc |
| Kotlin | /** KDoc */ | // comment | Dokka |
| C | /** Doxygen */ | /* comment */ | Doxygen |
| C++ | /** Doxygen */ | // comment | Doxygen |
| Go | // Doc comment | // comment | godoc, pkgsite |
| Rust | /// Doc comment | // comment | rustdoc |
| Swift | /// Doc comment | // comment | DocC |
| R | #' Roxygen | # comment | roxygen2 |
| C# | /// XML Doc | // comment | DocFX, Sandcastle |
| PHP | /** PHPDoc */ | // comment | phpDocumentor |
| Ruby | # YARD comment | # comment | YARD, RDoc |

References

For a more detailed discussion about how to comment, see the following references:

  • Kernighan, B. W., & Plauger, P. J. (1978). The Elements of Programming Style (2nd ed.). McGraw-Hill. (pp. 117-128)

  • McConnell, S. (2004). Code Complete (2nd ed.). Microsoft Press. (pp. 777-818)

  • Boswell, D., & Foucher, T. (2011). The Art of Readable Code. O’Reilly Media, Inc. (pp. 45-65)

  • Martin, R. C. (2009). Clean Code. Pearson Education. (pp. 53-74)

  • Goodliffe, P. (2007). Code Craft. No Starch Press. (pp. 73-88)

  • Contieri, M. (2023). Clean Code Cookbook. O’Reilly Media, Inc. (pp. 111-121)

  • Gregg, E. (2021). Best practices for writing code comments

  • Google Python Style Guide