hwpapi CLAUDE.md
Claude's Guide to Working with hwpapi
This document captures critical knowledge, patterns, and best practices for working with the hwpapi codebase. Follow these guidelines to work effectively and avoid common pitfalls.
🚨 CRITICAL: nbdev Workflow - MUST READ FIRST
The Golden Rule: NEVER Edit .py Files Directly
ALL Python files in hwpapi/ are AUTO-GENERATED from Jupyter notebooks. Editing them directly will result in lost changes.
❌ WRONG: Edit hwpapi/parametersets.py
✅ RIGHT: Edit nbs/02_api/02_parameters.ipynb, then run nbdev_export
File Mapping (Notebook → Python)
| Notebook | Generated Python File |
|----------|----------------------|
| nbs/02_api/00_core.ipynb | hwpapi/core.py |
| nbs/02_api/01_actions.ipynb | hwpapi/actions.py |
| nbs/02_api/02_parameters.ipynb | hwpapi/parametersets.py |
| nbs/02_api/03_classes.ipynb | hwpapi/classes.py |
| nbs/02_api/04_constants.ipynb | hwpapi/constants.py |
| nbs/02_api/06_logging.ipynb | hwpapi/logging.py |
Correct Workflow for Code Changes
# 1. Identify the .py file that needs changes
# 2. Find corresponding .ipynb file (see mapping above)
# 3. Edit the notebook
# 4. Export to regenerate .py files
nbdev_export
# 5. Test your changes
python -m pytest tests/
# 6. Commit BOTH notebook and generated .py file
git add nbs/02_api/02_parameters.ipynb hwpapi/parametersets.py
git commit -m "Your change description"
Warning Signs You're Doing It Wrong
- ⚠️ Editing a file with `# AUTOGENERATED! DO NOT EDIT!` at the top
- ⚠️ Changes disappear after running `nbdev_export`
- ⚠️ Notebook and .py file are out of sync
🏗️ Architecture Deep Dive
Core Design Patterns
1. Backend Abstraction Pattern
The codebase uses multiple backends to handle different parameter set types:
# Backend hierarchy
ParameterBackend (Protocol)
├── PsetBackend    # For pset objects (preferred, modern)
├── HParamBackend  # For HParameterSet (legacy)
├── ComBackend     # For generic COM objects
└── AttrBackend    # For plain Python objects
Key Functions:
- `_is_com(obj)` - Checks if object is COM (has `_oleobj_` or 'com_gen_py')
- `_looks_like_pset(obj)` - Checks for pset-specific methods (Item, SetItem, CreateItemSet)
- `make_backend(obj)` - Factory that auto-detects and returns appropriate backend
Important: Backend selection is automatic. Trust the factory function.
2. Property Descriptor System
Type-safe properties with automatic validation and conversion:
class CharShape(ParameterSet):
    bold = BoolProperty("Bold", "Bold formatting")
    fontsize = IntProperty("Size", "Font size in points")
    text_color = ColorProperty("TextColor", "Text color")
Property Types:
- `IntProperty` - Integer values
- `BoolProperty` - Boolean values
- `StringProperty` - String values
- `ColorProperty` - Hex color ↔ HWP color conversion
- `UnitProperty` - 🆕 Smart unit conversion (mm, cm, in, pt → HWPUNIT)
- `MappedProperty` - String ↔ Integer via mapping dict
- `TypedProperty` - Nested ParameterSet (manual)
- `ListProperty` - List of values (basic Python lists)
- `NestedProperty` - 🆕 Auto-creating nested ParameterSet with tab completion
- `ArrayProperty` - 🆕 Auto-creating HArray with list-like interface
Auto-Generated Attributes:
- `attributes_names` property returns `list(self._property_registry.keys())`
- ParameterSetMeta metaclass auto-populates `_property_registry`
- NEVER manually set `self.attributes_names = [...]` in subclasses
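To make the registry mechanism concrete, here is a minimal sketch of how a metaclass could collect descriptors into `_property_registry`. The class bodies are simplified assumptions for illustration; the real `ParameterSetMeta` and `PropertyDescriptor` in hwpapi carry more logic.

```python
class PropertyDescriptor:
    """Simplified stand-in: only stores the HWP key and a doc string."""

    def __init__(self, key, doc=""):
        self.key = key
        self.doc = doc


class ParameterSetMeta(type):
    """Collects every PropertyDescriptor declared on the class body."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        registry = {}
        # Inherit properties registered on parent classes first.
        for base in bases:
            registry.update(getattr(base, "_property_registry", {}))
        # Then add the descriptors declared directly on this class.
        for attr, value in namespace.items():
            if isinstance(value, PropertyDescriptor):
                registry[attr] = value
        cls._property_registry = registry
        return cls


class ParameterSet(metaclass=ParameterSetMeta):
    @property
    def attributes_names(self):
        """Read-only: derived from the registry, never assigned by hand."""
        return list(self._property_registry.keys())


class CharShape(ParameterSet):
    bold = PropertyDescriptor("Bold", "Bold formatting")
    fontsize = PropertyDescriptor("Size", "Font size in points")


print(CharShape().attributes_names)
```

Because `attributes_names` is a property without a setter, assigning to it raises `AttributeError`, which is the failure described under Issue 2.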
Auto-Creating Properties (New Pattern):
- `NestedProperty` and `ArrayProperty` automatically create underlying COM objects
- No manual `create_itemset()` or array initialization needed
- Full IDE tab completion and type hints
- Lazy creation on first access
3. Staging vs Immediate Mode
Two execution modes:
- Pset-based (Modern, Preferred)
  - Changes apply immediately
  - No staging required
  - Simpler mental model
- HSet-based (Legacy)
  - Changes are staged first
  - Must call `apply()` to commit
  - Supports transactional changes
Code Pattern:
# Check backend type
if isinstance(self._backend, PsetBackend):
    ...  # Immediate mode: write straight through to the backend
else:
    ...  # Staging mode - accumulate in self._staged until apply()
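The difference between the two modes can be illustrated with a small, self-contained sketch. `DictBackend`, `ImmediateSet`, and `StagedSet` are hypothetical names invented for this example; they only mimic the behaviour described above and are not the hwpapi classes.

```python
class DictBackend:
    """Hypothetical stand-in for a backend: stores values in a dict."""

    def __init__(self):
        self.values = {}

    def set(self, key, value):
        self.values[key] = value


class ImmediateSet:
    """Pset-style behaviour: every write goes straight to the backend."""

    def __init__(self, backend):
        self._backend = backend

    def set(self, key, value):
        self._backend.set(key, value)


class StagedSet:
    """HSet-style behaviour: writes are held until apply() is called."""

    def __init__(self, backend):
        self._backend = backend
        self._staged = {}

    def set(self, key, value):
        self._staged[key] = value

    def apply(self):
        for key, value in self._staged.items():
            self._backend.set(key, value)
        self._staged.clear()


immediate_backend = DictBackend()
ImmediateSet(immediate_backend).set("Bold", True)

staged_backend = DictBackend()
staged = StagedSet(staged_backend)
staged.set("Bold", True)
before_apply = dict(staged_backend.values)  # still empty at this point
staged.apply()
```

Forgetting `apply()` in staging mode leaves the backend untouched, which is the usual cause of "my change did nothing" reports with legacy sets.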
🐛 Common Issues and Solutions
Issue 1: Missing Function Definitions
Symptom: NameError: name '_is_com' is not defined
Cause: Helper function used but not defined in notebook
Solution:
# Add the missing function in the notebook with #| export
#| export
from typing import Any
def _is_com(obj: Any) -> bool:
    """Check if object is a COM object."""
    if obj is None:
        return False
    return hasattr(obj, '_oleobj_') or 'com_gen_py' in str(type(obj))
Issue 2: AttributeError with attributes_names
Symptom: AttributeError: property 'attributes_names' of 'X' object has no setter
Cause: Trying to set self.attributes_names = [...] after it became a read-only property
Solution: Define properties instead of setting attributes_names:
# ❌ OLD WAY (broken)
class MyPS(ParameterSet):
    def __init__(self):
        super().__init__()
        self.attributes_names = ["a", "b"]
        self.a = None
        self.b = None
# ✅ NEW WAY (correct)
class MyPS(ParameterSet):
    a = IntProperty("a", "Value a")
    b = IntProperty("b", "Value b")
    def __init__(self):
        super().__init__()
        # attributes_names auto-generated from properties
Issue 3: Backend is None
Symptom: AttributeError: 'NoneType' object has no attribute 'delete'
Cause: Unbound ParameterSet (created without COM object)
Solution: Add None checks:
def _del_value(self, name):
    """Legacy method - use backend instead."""
    if self._backend is None:
        return False
    return self._backend.delete(name)
Issue 4: Import Errors Between Modules
Symptom: Circular import or missing imports
Solution: Check notebook cell order and exports:
- Functions used in cell N must be defined/imported in cells 1 to N-1
- Add `#| export` directive to export functions
- Use `from typing import TYPE_CHECKING` for type-only imports
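As a brief illustration of the last point, a type-only import guarded by `TYPE_CHECKING` is never executed at runtime, so it cannot participate in a circular import. The `describe` function is a hypothetical helper used only for this example.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by type checkers only; skipped when the module actually runs.
    from hwpapi.parametersets import ParameterSet


def describe(pset: "ParameterSet") -> str:
    """Hypothetical helper: the quoted annotation is not evaluated at runtime."""
    return type(pset).__name__


print(describe(object()))
```

The annotation is written as a string so that the name does not need to exist when the function is defined.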
Issue 5: Duplicate Class/Method Definitions in Notebook
Symptom: Method doesn't behave as expected after edits; old logic still runs despite changes
Cause: Entire class or methods duplicated in notebook cell (common during copy-paste refactoring)
How to Detect:
# Count occurrences of method definition in generated file
python -c "with open('hwpapi/parametersets.py', encoding='utf-8') as f: print(f.read().count('def _format_int_value'))"
# Output: 2 (should be 1!)
# Find which cell has duplicates in notebook
python -c "import json; nb=json.load(open('nbs/02_api/02_parameters.ipynb', encoding='utf-8')); cell=nb['cells'][26]; src=''.join(cell['source']); print(f'Method appears {src.count(\"def _format_int_value\")} times in cell 26')"
Solution:
# Identify duplicate boundaries
python << 'EOF'
import json
import re
from collections import Counter
nb = json.load(open('nbs/02_api/02_parameters.ipynb', encoding='utf-8'))
cell = nb['cells'][26]
source = ''.join(cell['source'])
# Find all method definitions (methods are indented four spaces inside a class)
methods = [(m.start(), m.group(1)) for m in re.finditer(r'\n    def (\w+)', source)]
# Look for duplicate method names
method_counts = Counter(name for _, name in methods)
duplicates = {name: count for name, count in method_counts.items() if count > 1}
if duplicates:
    print(f"DUPLICATES FOUND: {duplicates}")
    print("\nRemove duplicate definitions manually from the notebook cell.")
else:
    print("No duplicates found.")
EOF
# After removing duplicates, export and verify
nbdev_export
python -c "with open('hwpapi/parametersets.py', encoding='utf-8') as f: print(f'Now has {f.read().count(\"def _format_int_value\")} definition(s)')"
Real Example:
- Entire `ParameterSet` class was duplicated in cell 26 (27,304 characters)
- Second `_format_int_value` had old logic: `'Size' in prop_name and 'Font' not in prop_name`
- First `_format_int_value` had correct logic: `'Size' in prop_name or prop_name.endswith('Size')`
- Second definition overrode first, causing `FontSize` to display as `1200` instead of `12.0pt`
- Fix: Removed duplicate from character position 29239 to end of cell
Prevention:
- Always check generated `.py` file after major refactoring
- Use `grep -c "def method_name" hwpapi/parametersets.py` to count definitions
- Be careful with copy-paste in notebooks
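As an alternative to counting strings, duplicates can also be found structurally. The sketch below uses the standard `ast` module to report any class, function, or method defined more than once at the same scope. `find_duplicate_defs` is a hypothetical helper, not part of hwpapi, and the sample source stands in for the generated file.

```python
import ast
from collections import Counter


def find_duplicate_defs(source: str) -> dict:
    """Return {qualified_name: count} for definitions that appear more than once."""
    counts = Counter()

    def visit(body, prefix):
        for node in body:
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                qualified = f"{prefix}{node.name}"
                counts[qualified] += 1
                if isinstance(node, ast.ClassDef):
                    # Descend so duplicated methods are reported too.
                    visit(node.body, f"{qualified}.")

    visit(ast.parse(source).body, "")
    return {name: n for name, n in counts.items() if n > 1}


# Stand-in for a generated module in which a whole class was pasted twice.
sample = """
class ParameterSet:
    def _format_int_value(self): return 1

class ParameterSet:
    def _format_int_value(self): return 2

def helper(): pass
"""
print(find_duplicate_defs(sample))
```

To check the real module, read `hwpapi/parametersets.py` with `encoding='utf-8'` and pass its contents to the function.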
🛠️ Simplification Strategies
Successfully Applied Simplifications
✅ Auto-Generate attributes_names (Completed)
Before:
class CharShape(ParameterSet):
    def __init__(self):
        super().__init__()
        self.attributes_names = [
            "facename_hangul", "facename_latin", ...,  # 67 lines!
        ]
After:
# In ParameterSet base class
@property
def attributes_names(self):
    """Auto-generated list of attribute names from property registry."""
    return list(self._property_registry.keys())
# In subclasses - just define properties, no manual list!
class CharShape(ParameterSet):
    facename_hangul = StringProperty("FaceNameHangul", "...")
    facename_latin = StringProperty("FaceNameLatin", "...")
    # No self.attributes_names needed!
Result: Removed ~500 lines, eliminated maintenance burden
Planned Simplifications (Priority Order)
Priority 1: Unify Backend Modes
- Remove dual staging behavior (immediate vs delayed)
- Pick one model and stick with it
- Impact: ~200 lines saved, 50% complexity reduction
Priority 2: Consolidate Property Types
- Replace 8 property classes with converter pattern
- Use `PropertyDescriptor("key", doc, converter=int)`
- Impact: ~200 lines saved, removes 6 classes
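One possible shape for the converter pattern is sketched below. It is an illustration of the planned direction, not existing hwpapi code: the `_values` dict and the `Demo` class are assumptions made so the example runs on its own.

```python
from typing import Any, Callable, Optional


class PropertyDescriptor:
    """Single descriptor class; the converter decides how values are coerced."""

    def __init__(self, key: str, doc: str = "",
                 converter: Optional[Callable[[Any], Any]] = None):
        self.key = key
        self.doc = doc
        self.converter = converter

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance._values.get(self.key)

    def __set__(self, instance, value):
        if self.converter is not None:
            value = self.converter(value)
        instance._values[self.key] = value


class Demo:
    """Hypothetical holder; `_values` stands in for the real backend."""

    rows = PropertyDescriptor("Rows", "Number of rows", converter=int)
    title = PropertyDescriptor("Title", "Table title", converter=str)

    def __init__(self):
        self._values = {}


demo = Demo()
demo.rows = "3"   # coerced by int
demo.title = 42   # coerced by str
print(demo.rows, demo.title)
```

Plain built-ins are not always suitable converters: `bool("False")` is `True`, so a boolean property would still need a dedicated converter function.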
Priority 3: Remove Forward Declarations
- Use `TYPE_CHECKING` or reorder definitions
- Impact: ~25 lines saved, eliminates confusion
🧪 Testing Strategy
Test Structure
tests/test_hparam.py
├── Unit tests (with mocks) - Run without HWP
├── Integration tests - Require HWP installed
└── Graceful skipping - Tests skip if dependencies unavailable
Running Tests
# All tests
python -m pytest tests/test_hparam.py -v
# Specific test class
python -m pytest tests/test_hparam.py::TestParameterSetUpdateFrom -v
# Show skipped tests
python -m pytest tests/test_hparam.py -v -ra
Test Requirements
For Unit Tests: Just Python + pytest
For Integration Tests:
- Windows OS
- HWP installed
- pywin32
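Graceful skipping can be done with the standard `unittest.skipUnless` decorator. The sketch below checks only the platform and whether `win32com` is importable; it does not verify that HWP itself is installed, and the flag and class names are hypothetical rather than taken from `tests/test_hparam.py`.

```python
import importlib.util
import sys
import unittest

# True only when pywin32's win32com package can be found.
HAS_PYWIN32 = importlib.util.find_spec("win32com") is not None
CAN_RUN_INTEGRATION = sys.platform == "win32" and HAS_PYWIN32


class TestWithHwp(unittest.TestCase):
    @unittest.skipUnless(CAN_RUN_INTEGRATION, "requires Windows with pywin32 and HWP")
    def test_needs_hwp(self):
        # A real integration test would talk to HWP here.
        self.assertTrue(CAN_RUN_INTEGRATION)
```

Running pytest with `-ra` then lists this test as skipped, with the reason string, on machines that lack the dependencies.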
Writing New Tests
import unittest
from hwpapi.parametersets import ParameterSet, IntProperty
class TestMyFeature(unittest.TestCase):
    def test_feature(self):
        # Use real ParameterSet subclasses that have properties defined
        from hwpapi.parametersets import CharShape
        ps = CharShape()
        ps.bold = True
        self.assertEqual(ps.bold, True)
Important: When testing ParameterSet, use classes with actual property descriptors, not manual attributes_names lists.
📝 Code Patterns to Follow
Pattern 1: Adding a New ParameterSet Class
#| export
class MyParameterSet(ParameterSet):
    """
    ### MyParameterSet
    123) MyParameterSet : 내 파라미터셋 (my parameter set)
    Description of what this parameter set does.
    """
    # Define properties (NOT attributes_names)
    my_int = IntProperty("MyInt", "Integer value")
    my_bool = BoolProperty("MyBool", "Boolean flag")
    my_color = ColorProperty("MyColor", "Color value")
    def __init__(self, parameterset=None, **kwargs):
        super().__init__(parameterset, **kwargs)
        # NO self.attributes_names = [...] needed!
        # Stage initial values if needed
        if 'my_int' in kwargs:
            self.my_int = kwargs['my_int']
Pattern 2: Auto-Creating Nested ParameterSets
#| export
from typing import Type
class NestedProperty(PropertyDescriptor):
    """
    Auto-creating nested ParameterSet property.
    Automatically calls CreateItemSet when first accessed.
    Example:
        class FindReplace(ParameterSet):
            find_char_shape = NestedProperty("FindCharShape", "CharShape", CharShape)
        pset = FindReplace(action.CreateSet())
        pset.find_char_shape.bold = True  # Auto-creates! Tab completion works!
    """
    def __init__(self, key: str, setid: str, param_class: Type["ParameterSet"], doc: str = ""):
        super().__init__(key, doc)
        self.setid = setid
        self.param_class = param_class
        self._cache_attr = f"_nested_cache_{key}"
    def __get__(self, instance, owner):
        if instance is None:
            return self
        # Check cache first
        if hasattr(instance, self._cache_attr):
            return getattr(instance, self._cache_attr)
        # Auto-create via CreateItemSet
        if instance._backend and hasattr(instance._backend, 'create_itemset'):
            nested_pset_com = instance._backend.create_itemset(self.key, self.setid)
            nested_wrapped = self.param_class(nested_pset_com)
        else:
            # Fallback: create unbound instance
            nested_wrapped = self.param_class()
        # Cache for future access
        setattr(instance, self._cache_attr, nested_wrapped)
        return nested_wrapped
Usage:
class FindReplace(ParameterSet):
    find_string = StringProperty("FindString", "Text to find")
    find_char_shape = NestedProperty("FindCharShape", "CharShape", CharShape)
pset = FindReplace(action.CreateSet())
pset.find_char_shape.bold = True  # Simple! Tab completion works!
Pattern 3: Adding a Custom Property Type
#| export
class MyProperty(PropertyDescriptor):
    """Custom property with special conversion."""
    def __get__(self, instance, owner):
        if instance is None:
            return self
        value = self._get_value(instance)
        if value is None:
            return self.default
        # Your conversion logic
        return my_conversion(value)
    def __set__(self, instance, value):
        if value is None:
            self._del_value(instance)
            return
        # Your validation logic
        converted = my_conversion(value)
        self._set_value(instance, converted)
Pattern 4: Checking COM Objects
# Always use the helper function
if _is_com(obj):
    ...  # Handle COM object
# For pset specifically
if _looks_like_pset(obj):
    ...  # Handle pset object
# Let factory decide
backend = make_backend(obj)  # Automatic detection
Pattern 5: Handling Optional Backend
def my_method(self):
    """Method that accesses backend."""
    if self._backend is None:
        # Handle unbound case
        return default_value
    # Proceed with backend operations
    return self._backend.get(self.key)
🆕 Auto-Creating Properties: NestedProperty & ArrayProperty
Overview
Problem: Manual nested parameter set creation is verbose and breaks tab completion:
# ❌ Old way - too complicated!
pset = FindReplace(action.CreateSet())
char_com = pset.create_itemset("FindCharShape", "CharShape")
char_shape = CharShape(char_com)
char_shape.bold = True
Solution: Auto-creating properties that work like regular Python attributes:
# ✅ New way - simple and intuitive!
pset = FindReplace(action.CreateSet())
pset.find_char_shape.bold = True  # Auto-creates! Tab completion works!
NestedProperty - Auto-Creating Nested ParameterSets
Purpose: Automatically create nested parameter sets via CreateItemSet on first access.
Signature:
NestedProperty(key: str, setid: str, param_class: Type[ParameterSet], doc: str = "")
Parameters:
- `key` - Parameter key in HWP (e.g., "FindCharShape")
- `setid` - SetID for CreateItemSet call (e.g., "CharShape")
- `param_class` - ParameterSet class to wrap (e.g., `CharShape`)
- `doc` - Documentation string
Example Definition:
class FindReplace(ParameterSet):
    """Find and replace parameters."""
    # Simple properties
    find_string = StringProperty("FindString", "Text to find")
    # Auto-creating nested properties
    find_char_shape = NestedProperty("FindCharShape", "CharShape", CharShape,
                                     "Character formatting to find")
    find_para_shape = NestedProperty("FindParaShape", "ParaShape", ParaShape,
                                     "Paragraph formatting to find")
Example Usage:
pset = app.actions.repeat_find.create_set()
# Access nested property - auto-creates CharShape via CreateItemSet!
pset.find_char_shape.bold = True
pset.find_char_shape.italic = False
pset.find_char_shape.text_color = "#FF0000"
# IDE provides full tab completion on find_char_shape!
# No manual create_itemset() call needed!
How It Works:
- First access to `pset.find_char_shape` triggers `NestedProperty.__get__`
- Calls `backend.create_itemset("FindCharShape", "CharShape")` to create COM object
- Wraps result in `CharShape` Python class
- Caches instance for future access
- Returns fully-typed instance with all properties available
Benefits:
- ✅ Tab completion - IDE knows exact type and shows all properties
- ✅ No manual creation - CreateItemSet called automatically
- ✅ Type safety - Enforces correct ParameterSet class
- ✅ Cached - Subsequent access returns same instance
- ✅ Lazy - Only created when actually accessed
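The lazy-create-then-cache behaviour can be observed without HWP by substituting a recording backend. Everything in this sketch is a simplified stand-in: `FakeBackend` is a hypothetical test double, and the `CharShape`, `NestedProperty`, and `FindReplace` classes here are stripped-down versions written only for the demonstration.

```python
class FakeBackend:
    """Hypothetical double that records CreateItemSet-style calls."""

    def __init__(self):
        self.calls = []

    def create_itemset(self, key, setid):
        self.calls.append((key, setid))
        return {"setid": setid}  # placeholder for the COM object


class CharShape:
    """Simplified wrapper: keeps the wrapped object and one attribute."""

    def __init__(self, com_obj=None):
        self.com_obj = com_obj
        self.bold = False


class NestedProperty:
    def __init__(self, key, setid, param_class, doc=""):
        self.key = key
        self.setid = setid
        self.param_class = param_class
        self.doc = doc
        self._cache_attr = f"_nested_cache_{key}"

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        cached = instance.__dict__.get(self._cache_attr)
        if cached is None:
            # First access: create the nested set, wrap it, remember it.
            com_obj = instance._backend.create_itemset(self.key, self.setid)
            cached = self.param_class(com_obj)
            instance.__dict__[self._cache_attr] = cached
        return cached


class FindReplace:
    find_char_shape = NestedProperty("FindCharShape", "CharShape", CharShape)

    def __init__(self, backend):
        self._backend = backend


backend = FakeBackend()
pset = FindReplace(backend)
calls_before_access = list(backend.calls)  # nothing created yet
pset.find_char_shape.bold = True           # triggers creation
same_object = pset.find_char_shape is pset.find_char_shape
```

The backend is called exactly once, on first access, and later accesses return the cached wrapper, so values set through it are not lost.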
UnitProperty - Smart Unit Conversion
Purpose: Automatically convert between user-friendly units (mm, cm, in, pt) and HWPUNIT.
Problem: HWPUNIT is not intuitive (1mm = 283 HWPUNIT, 1pt = 100 HWPUNIT)
Solution: Accept familiar units, auto-convert internally
Signature:
UnitProperty(key: str, doc: str,
default_unit: str = "mm",
output_unit: Optional[str] = None,
min_value: Optional[float] = None,
max_value: Optional[float] = None)
Example Definition:
class PageDef(ParameterSet):
    """Page layout."""
    # Dimensions in millimeters (most common for paper)
    width = UnitProperty("Width", "Page width", default_unit="mm")
    height = UnitProperty("Height", "Page height", default_unit="mm")
    # Margins in millimeters
    left_margin = UnitProperty("LeftMargin", "Left margin", default_unit="mm")
class CharShape(ParameterSet):
    """Character formatting."""
    # Font size in points (standard for typography)
    fontsize = UnitProperty("Height", "Font size", default_unit="pt")
Example Usage:
# Page dimensions - ALL of these work!
page = PageDef(action.CreateSet())
# String with unit (most explicit)
page.width = "210mm" # A4 width
page.height = "297mm" # A4 height
# Different units (auto-converts)
page.width = "21cm" # Same as 210mm
page.width = "8.27in" # Same as 210mm
# Bare number (uses default_unit = mm)
page.width = 210 # Assumes mm
# Set margins with mixed units
page.left_margin = 25 # 25mm (bare number)
page.right_margin = "2.5cm" # 25mm (converted)
page.top_margin = "1in" # ~25.4mm (converted)
# Get value (returns in mm)
print(f"Width: {page.width}mm") # Output: Width: 210.0mm
# Font size in points
char = CharShape(action.CreateSet())
char.fontsize = 12 # 12pt
char.fontsize = "12pt" # Same
char.fontsize = "4.23mm" # Converts to pt internally
print(f"Font: {char.fontsize}pt") # Output: Font: 12.0pt
Supported Units:
- `mm` - Millimeters (1mm ≈ 283 HWPUNIT) - Default for dimensions
- `cm` - Centimeters (1cm ≈ 2830 HWPUNIT)
- `in` - Inches (1in = 7200 HWPUNIT)
- `pt` - Points (1pt = 100 HWPUNIT) - Default for fonts
Benefits:
- ✅ Intuitive - Use familiar units (210mm instead of 59430 HWPUNIT)
- ✅ Flexible - String "210mm" or number 210 both work
- ✅ Auto-converting - Handles HWPUNIT internally
- ✅ Validated - Optional min/max in user units
- ✅ Standard units - mm for paper, pt for fonts
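The conversion itself is simple arithmetic. The sketch below uses the rounded factors listed above (283 per mm, 2830 per cm, 7200 per inch, 100 per point). `to_hwpunit` is a hypothetical name, and the real `from_hwpunit()` in hwpapi may use different factors or rounding.

```python
import re
from typing import Union

# Rounded factors as quoted in this guide (assumption: the library may differ).
HWPUNIT_PER_UNIT = {"mm": 283.0, "cm": 2830.0, "in": 7200.0, "pt": 100.0}
_LENGTH = re.compile(r"^\s*([0-9]*\.?[0-9]+)\s*(mm|cm|in|pt)?\s*$")


def to_hwpunit(value: Union[int, float, str], default_unit: str = "mm") -> int:
    """Convert '210mm', '21cm', or a bare number (in default_unit) to HWPUNIT."""
    if isinstance(value, str):
        match = _LENGTH.match(value)
        if match is None:
            raise ValueError(f"cannot parse length: {value!r}")
        number = float(match.group(1))
        unit = match.group(2) or default_unit
    else:
        number, unit = float(value), default_unit
    return round(number * HWPUNIT_PER_UNIT[unit])


def from_hwpunit(value: int, unit: str = "mm") -> float:
    """Convert HWPUNIT back to the requested unit."""
    return value / HWPUNIT_PER_UNIT[unit]


print(to_hwpunit("210mm"), to_hwpunit("21cm"), to_hwpunit(210))
print(to_hwpunit(12, default_unit="pt"), from_hwpunit(1200, "pt"))
```

With these factors, `"210mm"`, `"21cm"`, and a bare `210` all map to the same HWPUNIT value, which is what lets the property accept any of the three spellings.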
ArrayProperty - Auto-Creating HArray with List Interface
Purpose: Provide Pythonic list interface for HWP's HArray (PIT_ARRAY) parameters.
Signature:
ArrayProperty(key: str, item_type: Type, doc: str = "",
min_length: Optional[int] = None, max_length: Optional[int] = None)
Parameters:
- `key` - Parameter key in HWP (e.g., "TabStops", "Point")
- `item_type` - Type of array elements (`int`, `float`, `str`, `tuple`, etc.)
- `doc` - Documentation string
- `min_length` - Minimum array length (optional validation)
- `max_length` - Maximum array length (optional validation)
Example Definition:
class TabDef(ParameterSet):
    """Tab definition."""
    # Array of tab stop positions (in HWPUNIT)
    tab_stops = ArrayProperty("TabStops", int, "Tab stop positions")
class BorderFill(ParameterSet):
    """Border and fill settings."""
    # Array of 4 border widths: [left, right, top, bottom]
    border_widths = ArrayProperty("BorderWidth", int, "Border widths for each side",
                                  min_length=4, max_length=4)
class DrawCoordInfo(ParameterSet):
    """Drawing coordinates."""
    # Array of (X, Y) coordinate tuples
    points = ArrayProperty("Point", tuple, "Coordinate points")
Example Usage:
# Tab stops
tab_def = TabDef(action.CreateSet())
tab_def.tab_stops = [1000, 2000, 3000, 4000] # Set entire array
tab_def.tab_stops.append(5000) # Standard list method
print(tab_def.tab_stops[0]) # Index access: 1000
# Border widths
border = BorderFill(action.CreateSet())
border.border_widths = [10, 10, 20, 20] # left, right, top, bottom
border.border_widths[2] = 25 # Update top border
# Coordinates
coords = DrawCoordInfo(action.CreateSet())
coords.points = [(0, 0), (100, 100), (200, 50)]
coords.points.append((300, 75))
for i, (x, y) in enumerate(coords.points):
    print(f"Point {i}: ({x}, {y})")
List-Like Methods:
# HArrayWrapper provides full list interface:
array.append(item) # Add to end
array.insert(index, item) # Insert at position
array.remove(item) # Remove first occurrence
array.pop(index) # Remove and return
array.clear() # Remove all
array[index] # Get item
array[index] = value # Set item
len(array) # Array length
for item in array: ... # Iteration
How It Works:
- Assignment `array = [...]` triggers `ArrayProperty.__set__`
- Creates `HArrayWrapper` instance wrapping COM HArray
- Wrapper provides list-like interface
- All modifications sync to underlying HArray
- Full Python list semantics
Benefits:
- ✅ Pythonic - Works exactly like Python lists
- ✅ Tab completion - IDE shows all list methods
- ✅ Type validation - Ensures all items match `item_type`
- ✅ Length validation - Optional min/max constraints
- ✅ No COM knowledge needed - Pure Python interface
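A wrapper with these properties can be built on `collections.abc.MutableSequence`, which supplies `append`, `remove`, `pop`, `clear`, and iteration once five core methods exist. This is a sketch under stated assumptions: the `on_change` callback stands in for writing to the COM HArray, only integer indexing is handled, and the real `HArrayWrapper` may be implemented differently.

```python
from collections.abc import MutableSequence


class HArrayWrapper(MutableSequence):
    """List-like wrapper that validates items and reports every change."""

    def __init__(self, items, item_type, on_change):
        self._item_type = item_type
        self._on_change = on_change  # stand-in for syncing to the COM HArray
        self._items = [self._check(item) for item in items]
        self._sync()

    def _check(self, item):
        if not isinstance(item, self._item_type):
            raise TypeError(f"expected {self._item_type.__name__}, got {type(item).__name__}")
        return item

    def _sync(self):
        self._on_change(list(self._items))

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value):
        self._items[index] = self._check(value)
        self._sync()

    def __delitem__(self, index):
        del self._items[index]
        self._sync()

    def __len__(self):
        return len(self._items)

    def insert(self, index, value):
        self._items.insert(index, self._check(value))
        self._sync()


synced = []  # records each snapshot pushed to the "HArray"
stops = HArrayWrapper([1000, 2000, 3000], int, synced.append)
stops.append(4000)  # provided by MutableSequence via insert()
stops[2] = 3500
print(list(stops), len(synced))
```

Routing every mutation through `_sync` is what keeps the Python view and the underlying array from drifting apart.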
Complete Example: All Property Types Together
class AdvancedTable(ParameterSet):
    """Table with all property types demonstrated."""
    # Simple properties
    rows = IntProperty("Rows", "Number of rows")
    cols = IntProperty("Cols", "Number of columns")
    has_header = BoolProperty("HasHeader", "First row is header")
    title = StringProperty("Title", "Table title")
    align = MappedProperty("Align", "Alignment", ALIGN_MAP)
    # Unit properties - AUTO-CONVERTING!
    table_width = UnitProperty("Width", "Table width", default_unit="mm")
    table_height = UnitProperty("Height", "Table height", default_unit="mm")
    # Array properties - AUTO-CREATING!
    column_widths = ArrayProperty("ColWidths", int, "Width of each column in HWPUNIT")
    row_heights = ArrayProperty("RowHeights", int, "Height of each row in HWPUNIT")
    # Nested property - AUTO-CREATING!
    border_fill = NestedProperty("BorderFill", "BorderFill", BorderFill,
                                 "Border and fill settings")
# Usage - everything just works!
table = AdvancedTable(action.CreateSet())
# Simple properties
table.rows = 3
table.cols = 4
table.has_header = True
table.title = "Sales Report"
table.align = "center"
# Unit properties (auto-converts to HWPUNIT)
table.table_width = "150mm" # String with unit
table.table_height = 80 # Bare number, assumes mm
# OR use different units:
table.table_width = "15cm" # Same as 150mm
table.table_width = "5.91in" # Same as 150mm
# Array assignments (auto-creates HArray)
table.column_widths = [2000, 3000, 2500, 2000] # HWPUNIT values
table.row_heights = [1000, 1000, 1000]
# Array modifications
table.column_widths.append(1500)
table.column_widths[2] = 3500
# Nested object access (auto-creates BorderFill via CreateItemSet)
table.border_fill.border_type = "solid"
table.border_fill.fill_color = "#EEEEEE"
# If border_fill has UnitProperty for border widths:
table.border_fill.border_left = "2mm"
table.border_fill.border_right = "0.2cm" # Same as 2mm
# Execute
table.run()
Migration Guide
From TypedProperty to NestedProperty
Before:
class FindReplace(ParameterSet):
    find_char_shape = TypedProperty("FindCharShape", "Character formatting", CharShape)
# Usage - manual creation
pset = FindReplace(action.CreateSet())
char_com = pset.create_itemset("FindCharShape", "CharShape")
char_shape = CharShape(char_com)
char_shape.bold = True
After:
class FindReplace(ParameterSet):
    find_char_shape = NestedProperty("FindCharShape", "CharShape", CharShape,
                                     "Character formatting")
# Usage - automatic!
pset = FindReplace(action.CreateSet())
pset.find_char_shape.bold = True # Auto-creates!
Migration Steps:
- Change `TypedProperty(key, doc, ParamClass)` to `NestedProperty(key, setid, ParamClass, doc)`
- Add `setid` parameter (usually matches the class name)
- Remove manual `create_itemset()` calls in usage code
From ListProperty to ArrayProperty
Before:
class TabDef(ParameterSet):
    tab_stops = ListProperty("TabStops", "Tab positions", item_type=int)
# Usage - basic Python list (no COM sync)
tab_def = TabDef()
tab_def.tab_stops = [1000, 2000, 3000]
After:
class TabDef(ParameterSet):
    tab_stops = ArrayProperty("TabStops", int, "Tab positions")
# Usage - syncs with HArray
tab_def = TabDef(action.CreateSet())
tab_def.tab_stops = [1000, 2000, 3000] # Syncs to COM HArray
Key Differences:
- `ArrayProperty` requires binding to COM object (HArray)
- `ListProperty` is pure Python (no COM sync)
- Use `ArrayProperty` for HWP parameters that are PIT_ARRAY type
- Use `ListProperty` for internal Python-only lists
Property Type Decision Tree
Does this parameter exist in HWP documentation?
├─ NO → Use regular Python attribute or ListProperty
└─ YES → What type is it?
   ├─ Simple value (int, bool, string) → Use IntProperty, BoolProperty, StringProperty
   ├─ Enum/mapped value → Use MappedProperty
   ├─ Nested ParameterSet → Use NestedProperty (auto-creating!)
   ├─ Array (PIT_ARRAY) → Use ArrayProperty (auto-creating!)
   ├─ Unit value (HWPUNIT) → Use UnitProperty (auto-converts mm/cm/in/pt!)
   └─ Color value → Use ColorProperty
Unit Selection Guide:
- Page/table dimensions → UnitProperty with `default_unit="mm"`
- Margins/spacing → UnitProperty with `default_unit="mm"`
- Font size → UnitProperty with `default_unit="pt"`
- Border widths → UnitProperty with `default_unit="mm"`
- Line spacing → UnitProperty with `default_unit="pt"` or `"mm"`
Implementation Checklist
When adding auto-creating properties to a ParameterSet class:
For NestedProperty:
- [ ] Identify nested parameter sets in HWP documentation
- [ ] Find the `SetID` for `CreateItemSet` (usually matches class name)
- [ ] Import the nested ParameterSet class
- [ ] Define: `name = NestedProperty(key, setid, ParamClass, doc)`
- [ ] Test: `pset.name.some_property = value` works without manual creation
For ArrayProperty:
- [ ] Identify array parameters (PIT_ARRAY type in docs)
- [ ] Determine element type (int, float, str, tuple)
- [ ] Determine length constraints (if any)
- [ ] Define: `name = ArrayProperty(key, item_type, doc, min_length, max_length)`
- [ ] Test: `pset.name = [...]` and `pset.name.append(...)` work
Best Practices
DO ✅:
- Use `NestedProperty` for all nested ParameterSets (not `TypedProperty`)
- Use `ArrayProperty` for HWP array parameters (not `ListProperty`)
- Specify correct `setid` matching HWP documentation
- Provide clear documentation strings
- Add type hints for better IDE support
- Test tab completion works in your IDE
DON'T ❌:
- Don't use `TypedProperty` for new code (use `NestedProperty`)
- Don't manually call `create_itemset()` when using `NestedProperty`
- Don't use `ListProperty` for HWP array parameters (use `ArrayProperty`)
- Don't forget to bind ParameterSet before accessing auto-creating properties
- Don't mix up `key` and `setid` parameters
📺 ParameterSet Display Enhancements
Overview
The ParameterSet.__repr__() method has been enhanced with three powerful features that create self-documenting, human-readable output. These enhancements work together to make debugging and learning much easier.
Enhancement 1: Human-Readable Value Formatting
Purpose: Convert internal HWP values to intuitive, human-readable formats.
Conversions:
| Property Type | Internal Value | Display Format | Conversion |
|--------------|----------------|----------------|------------|
| Colors | 0x0000FF (BBGGRR) | #FF0000 | BBGGRR → #RRGGBB hex |
| Font Sizes | 1200 (HWPUNIT) | 12.0pt | HWPUNIT ÷ 100 |
| Dimensions | 59430 (HWPUNIT) | 210.0mm | via from_hwpunit() |
| Booleans | True/False | True/False | Direct display |
Implementation:
- `_format_int_value()` method detects property type
- Checks property name patterns: 'Size', 'Color', 'Width', 'Height', etc.
- Checks property descriptor type: `ColorProperty`, `UnitProperty`, etc.
- Applies appropriate conversion
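The color conversion in the table is a byte swap: HWP stores the blue byte highest and the red byte lowest. A minimal sketch follows; the function names are hypothetical and not necessarily those used inside `ColorProperty`.

```python
def bbggrr_to_hex(value: int) -> str:
    """Convert an HWP BBGGRR integer to a '#rrggbb' string."""
    red = value & 0xFF
    green = (value >> 8) & 0xFF
    blue = (value >> 16) & 0xFF
    return f"#{red:02x}{green:02x}{blue:02x}"


def hex_to_bbggrr(text: str) -> int:
    """Convert '#RRGGBB' (case-insensitive) to an HWP BBGGRR integer."""
    digits = text.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected six hex digits, got {text!r}")
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return (blue << 16) | (green << 8) | red


print(bbggrr_to_hex(0x0000FF))
print(hex(hex_to_bbggrr("#FF0000")))
```

This is why the internal value `0x0000FF` displays as red rather than blue.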
Example:
pset = CharFormat()
pset.FontSize = 1200
pset.TextColor = 0x0000FF
pset.Width = 59430
print(pset)
# Output:
# CharFormat(
# FontSize=12.0pt
# TextColor="#ff0000"
# Width=210.0mm
# )
Enhancement 2: Enum Display for MappedProperty
Purpose: Show both numeric value and mapped name for enum-like properties.
Format: {numeric_value} ({mapped_name})
How It Works:
- Detects when property descriptor is `MappedProperty`
- Retrieves raw numeric value from backend or staging dict
- Gets mapped string name from property getter
- Formats as `value (name)`
Example:
class BookMark(ParameterSet):
    Type = MappedProperty("Type", {
        "일반책갈피": 0,  # normal bookmark
        "블록책갈피": 1   # block bookmark
    }, "Bookmark type")
bookmark = BookMark()
bookmark.Type = "블록책갈피"  # Set using string name
print(bookmark)
# Output:
# BookMark(
#   Type=1 (블록책갈피)
#   ...
# )
Benefits:
- See internal numeric value HWP uses
- See human-readable name simultaneously
- Understand enum mappings without checking docs
- Works with any language (Korean, English, etc.)
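The `value (name)` format amounts to a reverse lookup in the mapping dict. A minimal sketch, with `format_mapped` as a hypothetical helper name; the fallback for unmapped values is an assumption.

```python
def format_mapped(raw_value: int, mapping: dict) -> str:
    """Render a raw enum value as 'value (name)' using a name -> value mapping."""
    reverse = {number: name for name, number in mapping.items()}
    name = reverse.get(raw_value)
    # Fall back to the bare number when the value has no mapped name.
    return f"{raw_value} ({name})" if name is not None else str(raw_value)


ALIGN = {"left": 0, "center": 1, "right": 2, "justify": 3}
print(format_mapped(1, ALIGN))
print(format_mapped(9, ALIGN))
```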
Common Use Cases:
# Search direction
Direction=0 (down)
Direction=1 (up)
Direction=2 (all)
# Text alignment
Align=0 (left)
Align=1 (center)
Align=2 (right)
Align=3 (justify)
# Bookmark types (Korean)
Type=0 (일반책갈피)
Type=1 (블록책갈피)
Enhancement 3: Property Description Comments
Purpose: Display inline documentation for every property.
Format: property=value # description
How It Works:
- Checks if property descriptor has `doc` attribute
- Appends as inline comment after the formatted value
- Works with all property types
Example:
class VideoInsert(ParameterSet):
    Base = StringProperty("Base", "동영상 파일의 경로")  # path of the video file
    Format = MappedProperty("Format", {"mp4": 0, "avi": 1}, "동영상 형식")  # video format
    Width = IntProperty("Width", "동영상 너비 (HWPUNIT)")  # video width
video = VideoInsert()
video.Base = "C:/Videos/sample.mp4"
video.Format = "mp4"
video.Width = 59430
print(video)
# Output:
# VideoInsert(
#   Base="C:/Videos/sample.mp4"  # 동영상 파일의 경로
#   Format=0 (mp4)  # 동영상 형식
#   Width=210.0mm  # 동영상 너비 (HWPUNIT)
# )
Benefits:
- Self-documenting: No need to check external docs
- Units clarified: Know if it's HWPUNIT, pt, mm, etc.
- Format explained: Understand BBGGRR, enum values, ranges
- Context provided: Hints, constraints, valid values
- Multilingual: Works with Korean and English descriptions
All Three Enhancements Together
Complete Example:
class CharFormat(ParameterSet):
    FontName = StringProperty("FontName", "Font family name")
    FontSize = IntProperty("FontSize", "Font size in HWPUNIT (100 = 1pt)")
    TextColor = ColorProperty("TextColor", "Text color in BBGGRR format")
    Bold = BoolProperty("Bold", "Bold formatting")
    Underline = MappedProperty("Underline", {
        "none": 0, "single": 1, "double": 2
    }, "Underline style")
char = CharFormat()
char.FontName = "Arial"
char.FontSize = 1200
char.TextColor = 0x0000FF
char.Bold = True
char.Underline = "single"
print(char)
# Output:
# CharFormat(
# Bold=True # Bold formatting
# FontName="Arial" # Font family name
# FontSize=12.0pt # Font size in HWPUNIT (100 = 1pt)
# TextColor="#ff0000" # Text color in BBGGRR format
# Underline=1 (single) # Underline style
# [staged changes: 5]
# )
Notice:
- `12.0pt` - Human-readable value (Enhancement 1)
- `#ff0000` - Color converted to hex (Enhancement 1)
- `1 (single)` - Enum shows value + name (Enhancement 2)
- `# Font size...` - Description explains everything (Enhancement 3)
Implementation Details
Location: ParameterSet._format_repr() method in nbs/02_api/02_parameters.ipynb (Cell 26)
Key Methods:
def __repr__(self):
    """Return human-readable representation."""
    return self._format_repr()
def _format_repr(self, indent=0, max_depth=3):
    """Format ParameterSet with all enhancements."""
    # 1. Get all properties from registry
    # 2. Format each value based on type
    # 3. Add enum display for MappedProperty
    # 4. Append description comment
    # 5. Return complete formatted string
def _format_int_value(self, prop_name, prop_descriptor, value):
    """Format integer values based on property type."""
    # Detect colors, sizes, dimensions
    # Apply appropriate conversion
    # Return formatted string
Testing:
# Run demos
python examples/nested_property_demo.py
python examples/mapped_property_display_demo.py
python examples/property_description_display_demo.py
Best Practices for Property Definitions
DO ✅:
# Provide clear, informative descriptions
FontSize = IntProperty("FontSize", "Font size in HWPUNIT (100 = 1pt)")
Width = IntProperty("Width", "Table width in HWPUNIT (283 = 1mm)")
Rows = IntProperty("Rows", "Number of rows (1-500)")
# Include units, formats, ranges in descriptions
TextColor = ColorProperty("TextColor", "Text color in BBGGRR format")
Direction = MappedProperty("Direction", {...}, "Search direction (down=forward, up=backward)")
# Use descriptive enum values
Align = MappedProperty("Align", {
    "left": 0,
    "center": 1,
    "right": 2
}, "Text alignment on page")
DON'T ❌:
# Don't leave descriptions empty
FontSize = IntProperty("FontSize", "")  # No help for users!
# Don't omit units/constraints
Width = IntProperty("Width", "Width")  # Width in what unit?
# Don't use cryptic enum values
Mode = MappedProperty("Mode", {
    "m1": 0,  # What is m1?
    "m2": 1   # What is m2?
}, "Mode")
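To show why descriptive enum names and descriptions pay off, here is a simplified stand-in for a mapped property. MappedProp and ParaDemo are hypothetical names invented for this sketch; the real MappedProperty in parametersets.py goes through a backend and may behave differently.

```python
class MappedProp:
    """Hypothetical stand-in: stores an int, accepts readable names."""

    def __init__(self, key, mapping, doc=""):
        self.key = key
        self.mapping = dict(mapping)                          # name -> int
        self.names = {v: k for k, v in self.mapping.items()}  # int -> name
        self.doc = doc

    def __set_name__(self, owner, name):
        self.attr = "_" + name  # per-instance storage slot

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # class access returns the descriptor itself
        return getattr(obj, self.attr, None)

    def __set__(self, obj, value):
        if isinstance(value, str):
            if value not in self.mapping:
                raise ValueError(f"{self.key}: expected one of {sorted(self.mapping)}")
            value = self.mapping[value]
        elif value not in self.names:
            raise ValueError(f"{self.key}: unknown value {value!r}")
        setattr(obj, self.attr, value)

    def describe(self, obj):
        """Render 'Key=value (name)  # description' for display."""
        value = self.__get__(obj)
        return f"{self.key}={value} ({self.names[value]})  # {self.doc}"


class ParaDemo:
    Align = MappedProp("Align", {"left": 0, "center": 1, "right": 2},
                       "Text alignment on page")


p = ParaDemo()
p.Align = "center"
print(p.Align)                     # 1
print(ParaDemo.Align.describe(p))  # Align=1 (center)  # Text alignment on page
```

With names like "m1" and a description of "Mode", the same display line would tell the reader nothing, which is the point of the DON'T examples above.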
Benefits Summary
For Users:
- ✓ Understand parameters without checking docs
- ✓ See values in familiar units (pt, mm, #RRGGBB)
- ✓ Learn API while debugging
- ✓ Verify correct values are being set
For Developers:
- ✓ Self-documenting code
- ✓ Easier debugging
- ✓ Better error messages possible
- ✓ Reduced support questions
For Documentation:
- ✓ Examples show real, understandable values
- ✓ Screenshots are more informative
- ✓ API is more discoverable
Best Practices
DO ✅
- Always edit notebooks, never .py files
- Run nbdev_export after notebook changes
- Test after every change (at least run imports)
- Use property descriptors for new ParameterSet attributes
- Check for None backend in methods that access it
- Trust the backend factory (make_backend)
- Follow existing patterns in similar code
- Add #| export directive to cells that should be exported
- Use type hints (already set up with from __future__ import annotations)
- Keep backward compatibility when refactoring
DON'T ❌
- DON'T edit .py files in hwpapi/ directly
- DON'T set self.attributes_names = [...] in subclasses
- DON'T assume backend is always present (can be None)
- DON'T mix staging modes without understanding
- DON'T add features without tests
- DON'T break existing API without migration path
- DON'T use isinstance checks unless necessary
- DON'T forget to export functions with #| export
- DON'T commit without running nbdev_export
- DON'T create new COM detection logic (use _is_com)
Development Workflow
Making a Change (Step by Step)
# 1. Identify what needs changing
# - Read the .py file to understand current code
# - Find the corresponding .ipynb file
# 2. Edit the notebook
# - Open nbs/02_api/XX_name.ipynb
# - Make your changes in the appropriate cell
# - Add #| export if creating new exported code
# 3. Export to Python
nbdev_export
# 4. Verify the generated code
# - Check that hwpapi/*.py has your changes
# - Look for any warnings or errors
# 5. Test your changes
python -c "import hwpapi; from hwpapi.parametersets import CharShape"
python -m pytest tests/test_hparam.py -v
# 6. Test in actual use (if possible)
python examples/your_example.py
# 7. Commit both files
git add nbs/02_api/XX_name.ipynb hwpapi/name.py
git commit -m "Description of change"
Quick Verification Script
Save this as verify_changes.sh:
#!/bin/bash
echo "=== Running nbdev_export ==="
nbdev_export
echo -e "\n=== Testing imports ==="
python -c "import hwpapi; from hwpapi.parametersets import CharShape, BorderFill; print('✓ Imports successful')"
echo -e "\n=== Running tests ==="
python -m pytest tests/test_hparam.py -v --tb=short
echo -e "\n=== Checking for common issues ==="
grep -r "self.attributes_names = \[" nbs/ && echo "❌ Found manual attributes_names!" || echo "✓ No manual attributes_names"
echo -e "\n✅ Verification complete!"
Key Reference Information
Important Files
| File | Purpose |
|------|---------|
| settings.ini | nbdev configuration, version, metadata |
| .clinerules | Claude Code spec file (project overview) |
| CLAUDE.md | This file - working guidelines |
| PSET_MIGRATION_SUMMARY.md | Context on pset refactoring |
| REFACTORING_SUMMARY.md | Recent refactoring documentation |
| DUPLICATE_FIX_SUMMARY.md | Duplicate class bug fix and display formatting (2025-12-09) |
| AUTO_PROPERTY_DESIGN.md | NestedProperty & ArrayProperty design |
| UNIT_PROPERTY_ENHANCEMENT.md | Smart unit conversion specification |
Key Classes
| Class | Location | Purpose |
|-------|----------|---------|
| App | core.py | High-level API for users |
| Engine | core.py | Mid-level wrapper around HwpObject |
| ParameterSet | parametersets.py | Base class for all parameter sets |
| PropertyDescriptor | parametersets.py | Base for property types |
| ParameterSetMeta | parametersets.py | Metaclass for auto-registration |
Key Functions
| Function | Location | Purpose |
|----------|----------|---------|
| _is_com(obj) | parametersets.py | Check if object is COM |
| _looks_like_pset(obj) | parametersets.py | Check if object is pset |
| make_backend(obj) | parametersets.py | Create appropriate backend |
| resolve_action_args() | parametersets.py | Resolve action arguments |
Environment Variables
# Logging Configuration
HWPAPI_LOG_LEVEL=DEBUG # Set logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
# Default: WARNING (production-friendly, only shows warnings/errors)
# Use DEBUG or INFO for development/troubleshooting
# Examples:
# Development - show all logs
export HWPAPI_LOG_LEVEL=DEBUG
# Production - only warnings and errors (default if not set)
export HWPAPI_LOG_LEVEL=WARNING
# Quiet mode - only errors and critical
export HWPAPI_LOG_LEVEL=ERROR
Important: The default log level is WARNING, which means normal users only see warnings, errors, and critical messages. This is intentional to avoid cluttering output in production. Set HWPAPI_LOG_LEVEL=DEBUG or INFO when you need detailed logging for development or troubleshooting.
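The default-to-WARNING behaviour described above can be sketched with the standard logging module. The helper name resolve_log_level and the fallback for unrecognized values are assumptions made for illustration; hwpapi/logging.py may resolve the variable differently.

```python
import logging
import os

# Level names listed in the comment above for HWPAPI_LOG_LEVEL.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def resolve_log_level(environ=None) -> int:
    """Return the numeric level for HWPAPI_LOG_LEVEL, defaulting to WARNING."""
    environ = os.environ if environ is None else environ
    name = environ.get("HWPAPI_LOG_LEVEL", "WARNING").strip().upper()
    # Assumption: an unrecognized value falls back to WARNING instead of raising.
    return _LEVELS.get(name, logging.WARNING)


logger = logging.getLogger("hwpapi_demo")  # hypothetical logger name
logger.setLevel(resolve_log_level())
```

Passing a plain dict as environ keeps the function easy to test without touching the real process environment.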
Debugging Tips
Issue: Changes not appearing after nbdev_export
# Force re-export
rm hwpapi/*.py
nbdev_export
# Validate the notebook structure (schema check, not Python syntax)
python -c "import nbformat; nbformat.validate(nbformat.read('nbs/02_api/02_parameters.ipynb', as_version=4)); print('OK')"
Issue: Import errors
# Check what's exported
python -c "import hwpapi.parametersets; print(dir(hwpapi.parametersets))"
# Check __all__ in generated file
grep "__all__" hwpapi/parametersets.py
Issue: Test failures
# Run with verbose output
python -m pytest tests/test_hparam.py -vv -s
# Run specific test
python -m pytest tests/test_hparam.py::TestClass::test_method -vv
# Show full traceback
python -m pytest tests/test_hparam.py --tb=long
Issue: Notebook corruption
# Validate notebook JSON
python -c "import nbformat; nbformat.validate(nbformat.read('nbs/02_api/02_parameters.ipynb', as_version=4)); print('OK')"
# If corrupted, restore from git
git checkout nbs/02_api/02_parameters.ipynb
Issue: Duplicate definitions in notebook
# Quick check for duplicate method names in generated file
grep -c "def _format_int_value" hwpapi/parametersets.py # Should be 1
# Find all duplicate methods in a file
python << 'EOF'
import re
from collections import Counter
with open('hwpapi/parametersets.py', encoding='utf-8') as f:
    content = f.read()
# Find all method definitions (defs indented one level)
# Note: counts are file-wide, so names shared by different classes (e.g. __init__) also appear
methods = re.findall(r'\n    def (\w+)', content)
method_counts = Counter(methods)
# Show duplicates
duplicates = {name: count for name, count in method_counts.items() if count > 1}
if duplicates:
    print("DUPLICATES FOUND:")
    for name, count in duplicates.items():
        print(f"  {name}: {count} times")
else:
    print("No duplicates found.")
EOF
# Find which notebook cell has the duplicate
python -c "
import json
nb = json.load(open('nbs/02_api/02_parameters.ipynb', encoding='utf-8'))
for i, cell in enumerate(nb['cells']):
    source = ''.join(cell.get('source', []))
    count = source.count('def _format_int_value')
    if count > 0:
        print(f'Cell {i}: {count} definition(s)')
"
# After fixing, verify
nbdev_export
grep -c "def _format_int_value" hwpapi/parametersets.py # Should be 1 now
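A per-class check avoids flagging names that merely repeat across different classes. The following is a sketch using the standard ast module; find_duplicate_defs is a hypothetical helper, not something shipped with hwpapi.

```python
import ast
from collections import Counter


def find_duplicate_defs(source: str) -> dict:
    """Map each scope ('<module>' or a class name) to names defined more than once in it."""
    tree = ast.parse(source)
    scopes = [("<module>", tree.body)]
    scopes += [(node.name, node.body) for node in ast.walk(tree)
               if isinstance(node, ast.ClassDef)]
    report = {}
    for scope_name, body in scopes:
        counts = Counter(
            node.name for node in body
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        )
        dupes = {name: n for name, n in counts.items() if n > 1}
        if dupes:
            report.setdefault(scope_name, {}).update(dupes)
    return report


sample = '''
class ParameterSet:
    def _format_int_value(self): return "new"
    def __repr__(self): return "x"
    def _format_int_value(self): return "old"

class CharShape:
    def __repr__(self): return "y"
'''
print(find_duplicate_defs(sample))  # {'ParameterSet': {'_format_int_value': 2}}
```

To check the generated module, pass it the contents of hwpapi/parametersets.py. One known limitation of this sketch: a property getter and its setter share a name, so such pairs are reported as duplicates and need a manual look.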
Lessons Learned
From Recent Simplifications
1. Auto-generation beats manual maintenance
   - attributes_names property eliminated 500+ lines
   - No sync issues between properties and attribute lists
   - Single source of truth (property registry)
2. Property system is powerful
   - Declarative is better than imperative
   - Type safety and validation in one place
   - Easier to extend and maintain
3. Edge cases matter
   - Always check for None backend
   - Handle unbound ParameterSets
   - Tests catch these issues
4. nbdev workflow is non-negotiable
   - NEVER edit .py files directly
   - Always export after notebook changes
   - Commit both notebook and generated code
5. Simplification requires careful testing
   - Changed behavior can break tests
   - Update tests to match new patterns
   - Verify with real usage, not just unit tests
6. Human-readable display is valuable
   - Raw HWPUNIT/BBGGRR values are not intuitive
   - Smart formatting (_format_int_value) makes debugging easier
   - __repr__ showing properties helps users understand ParameterSet state
   - Context-aware formatting: colors as hex, sizes as pt, dimensions as mm
7. Duplicate detection is critical after refactoring
   - Always verify generated .py after major notebook edits
   - Count method definitions: grep -c "def method_name" file.py
   - Duplicates can silently override correct implementations
   - Second definition always wins in Python class definitions
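The "second definition wins" behaviour is easy to confirm in isolation. The class below is a toy example, unrelated to the real ParameterSet:

```python
class Demo:
    def label(self):
        return "first"

    def label(self):  # silently replaces the definition above, no warning is raised
        return "second"


print(Demo().label())  # second
```

Python builds the class namespace top to bottom, so the later def simply rebinds the name. This is why a stale duplicate pasted below the fixed method re-introduces the old behaviour.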
Pitfalls Encountered
- ❌ Editing .py files → Changes lost on next export
- ❌ Setting self.attributes_names → AttributeError (property has no setter)
- ❌ Missing _is_com definition → NameError
- ❌ Not checking for None backend → AttributeError
- ❌ Manual attribute lists out of sync → Runtime errors
- ❌ Duplicate class/method definitions in notebook → Second definition overrides first, causing bugs
- ❌ Copy-paste refactoring without checking for duplicates → Hard-to-debug issues
Understanding the Domain
HWP (Hancom Office)
- Korean word processor (like MS Word for Korea)
- COM automation via HwpObject
- Actions executed via Run() with parameter sets
win32com Interface
- PyWin32 provides COM bridge
- COM objects have _oleobj_ attribute
- Generated COM classes have 'com_gen_py' in type string
Parameter Sets
- Configure HWP actions (like "InsertText", "FindReplace")
- Two flavors: pset (modern) and HSet (legacy)
- Properties map Python names to COM property names
Codebase Metrics
Current State:
- Total lines: ~15,000
- parametersets.py: ~4,100 lines
- ParameterSet subclasses: 29
- Property descriptors: 438
- Action definitions: 899+
After Priority 2 Simplification:
- Estimated: ~500 lines removed
- Maintenance burden: Significantly reduced
- Complexity: Much lower
HWP Object Model: Official vs Current Architecture
Overview
This section compares the official HWP Automation Object Model (from HwpAutomation_2504.pdf) with the current hwpapi implementation to identify gaps, misalignments, and opportunities for better code organization.
Documentation Source: hwp_docs/HwpAutomation_2504.pdf (Korean, dated 2025-04-15)
Official HWP Object Model Structure
The official HWP automation follows a hierarchical object model similar to Microsoft Office:
IHwpObject (Root COM Object)
│
├── IXHwpDocuments (Collection)
│   └── IXHwpDocument (Single)
│       ├── Properties: FullName, Name, Path, Saved, etc.
│       └── Methods: Save(), SaveAs(), Close(), Print(), etc.
│
├── IXHwpWindows (Collection)
│   └── IXHwpWindow (Single)
│       ├── Properties: Width, Height, Left, Top, Active, etc.
│       └── Methods: Activate(), Close(), etc.
│
├── IXHwpForms (Collection)
│   └── Form Controls (Various types)
│       ├── IXHwpFormPushButtons (Collection)
│       ├── IXHwpFormCheckButtons (Collection)
│       ├── IXHwpFormRadioButtons (Collection)
│       ├── IXHwpFormComboBoxes (Collection)
│       └── etc.
│
├── HAction (Action Execution System)
│   ├── GetActionIDByName(name) → ActionID
│   ├── Run(ActionID)
│   └── Execute(ActionID, ParameterSet)
│
├── HParameterSet (Parameter Management)
│   ├── CreateItemSet(SetID, ParamIndex) → Creates nested parameter set
│   ├── Item(ParamIndex) → Get parameter value
│   ├── SetItem(ParamIndex, Value) → Set parameter value
│   └── Clear() → Clear all parameters
│
├── HSet (Parameter Collection - Legacy)
│   └── Collection of parameters for complex actions
│
└── HArray (Parameter Arrays - PIT_ARRAY type)
    ├── Count → Number of elements
    ├── Item(index) → Get element at index
    ├── SetItem(index, value) → Set element at index
    ├── Add(value) → Append element
    └── RemoveAt(index) → Remove element at index
Key Characteristics:
- Collection Pattern: Documents, Windows, Forms follow "Collection → Single Object" pattern
- Hierarchical Navigation: Document → Sections → Paragraphs → Characters
- Action System: Centralized via HAction.Execute() with HParameterSet
- 900+ Actions: Each with specific parameter requirements
- Type-Safe Parameters: Strongly typed via HParameterSet interface
Current hwpapi Architecture
The current implementation uses a wrapper-based approach with custom patterns:
App (Main Entry Point)
│
├── Engine
│   ├── impl (HwpObject COM object)
│   └── Direct COM access: self.api.MovePos(), self.api.Run(), etc.
│
├── _Actions (900+ actions as properties)
│   ├── CharShape → _Action("CharShape", CharShape parameterset)
│   ├── ParaShape → _Action("ParaShape", ParaShape parameterset)
│   └── [899+ more actions...]
│
├── ParameterSet System (130+ classes in parametersets.py)
│   ├── Base: ParameterSet, ParameterSetMeta
│   ├── Backend Abstraction:
│   │   ├── PsetBackend (modern, immediate)
│   │   ├── HParamBackend (legacy, staging)
│   │   ├── ComBackend (generic COM)
│   │   └── AttrBackend (pure Python)
│   ├── Property Descriptors:
│   │   ├── IntProperty, BoolProperty, StringProperty
│   │   ├── ColorProperty, UnitProperty
│   │   ├── MappedProperty, TypedProperty, ListProperty
│   │   └── Auto-registration via ParameterSetMeta
│   └── 130+ ParameterSet Subclasses:
│       ├── Text/Char: CharShape, ParaShape, BulletShape, etc.
│       ├── Tables: Table, Cell, TableCreation, etc.
│       ├── Drawing: ShapeObject, DrawLineAttr, DrawImageAttr, etc.
│       ├── Document: DocumentInfo, PageDef, SecDef, etc.
│       └── [All mixed in single 3,357-line file]
│
├── Custom Accessors (Pythonic convenience layer)
│   ├── MoveAccessor: Navigation (move.top_of_file(), move.bottom(), etc.)
│   ├── CellAccessor: Table cell operations
│   ├── TableAccessor: Table operations
│   └── PageAccessor: Page operations
│
└── Dataclasses (Alternative representation)
    ├── Character, CharShape (dataclass)
    ├── Paragraph, ParaShape (dataclass)
    └── PageShape (dataclass)
Key Characteristics:
- Flat Entry Point: Single App object, no collections exposed
- Action Properties: 900+ actions as dynamic properties on _Actions
- Backend Polymorphism: 4 backend types handle different parameter storage
- Pythonic Wrappers: Custom accessors hide COM complexity
- Monolithic ParameterSets: All 130+ classes in one file
Comparison Matrix
| Aspect | Official HWP Model | Current hwpapi | Alignment |
|--------|-------------------|----------------|-----------|
| Entry Point | IHwpObject COM object | App wrapper around Engine | ✅ Aligned (wrapped) |
| Document Access | IXHwpDocuments collection | App.api direct access | ❌ Collection pattern not exposed |
| Window Management | IXHwpWindows collection | App.set_visible() only | ⚠️ Partial (no multi-window support) |
| Form Controls | IXHwpForms collection | Not exposed | ❌ Missing |
| Action Execution | HAction.Execute(id, pset) | app.actions.ActionName(pset) | ✅ Aligned (pythonic wrapper) |
| Parameter Sets | HParameterSet COM object | ParameterSet Python classes | ✅ Well abstracted |
| Parameter Typing | COM types | Python property descriptors | ✅ Excellent (better than COM) |
| Nested Parameters | CreateItemSet method | NestedProperty auto-creates | ✅ Enhanced (auto-creating) |
| Arrays (HArray) | COM array methods | ArrayProperty + HArrayWrapper | ✅ Enhanced (Pythonic list) |
| Navigation | Object hierarchy | Custom accessors | ⚠️ Different paradigm |
| Organization | Domain-based modules | Single monolithic file | ❌ Poor organization |
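As an illustration of the "Pythonic list" row, a list-like wrapper can be layered over the HArray methods shown in the object model tree (Count, Item, SetItem, Add, RemoveAt). HArrayList and FakeHArray are hypothetical names used only for this sketch; the real HArrayWrapper may differ, and whether HArray indexes are zero-based is an assumption here.

```python
from collections.abc import MutableSequence


class FakeHArray:
    """In-memory stand-in exposing the HArray-style methods named in this guide."""
    def __init__(self):
        self._data = []
    @property
    def Count(self):
        return len(self._data)
    def Item(self, index):
        return self._data[index]
    def SetItem(self, index, value):
        self._data[index] = value
    def Add(self, value):
        self._data.append(value)
    def RemoveAt(self, index):
        del self._data[index]


class HArrayList(MutableSequence):
    """List-like view over an HArray-style object (integer indexes only)."""
    def __init__(self, harray):
        self._arr = harray

    def __len__(self):
        return self._arr.Count

    def _index(self, index):
        if not isinstance(index, int):
            raise TypeError("only integer indexes are supported in this sketch")
        n = len(self)
        if index < 0:
            index += n
        if not 0 <= index < n:
            raise IndexError("HArray index out of range")
        return index

    def __getitem__(self, index):
        return self._arr.Item(self._index(index))

    def __setitem__(self, index, value):
        self._arr.SetItem(self._index(index), value)

    def __delitem__(self, index):
        self._arr.RemoveAt(self._index(index))

    def insert(self, index, value):
        n = len(self)
        if index < 0:
            index = max(0, n + index)
        index = min(index, n)
        self._arr.Add(value)               # grow by one slot at the end
        for i in range(n, index, -1):      # shift the tail right by one
            self._arr.SetItem(i, self._arr.Item(i - 1))
        self._arr.SetItem(index, value)


items = HArrayList(FakeHArray())
items.extend([10, 30])
items.insert(1, 20)
print(list(items))  # [10, 20, 30]
```

Deriving from MutableSequence means append, extend, pop, and iteration come for free once the five core methods are defined.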
Identified Gaps and Misalignments
1. Missing Collection Objects ❌
Issue: hwpapi doesn't expose collection objects like IXHwpDocuments, IXHwpWindows, IXHwpForms
Impact:
- Cannot enumerate open documents
- Cannot manage multiple windows
- No access to form controls
- Limits multi-document workflows
Example (What's Missing):
# This is possible in official HWP but not in hwpapi:
documents = hwp.Documents # Collection of all open documents
doc = documents.Item(0) # Get first document
doc.Save() # Save specific document
Current hwpapi:
# Only single document access:
app = App() # Always refers to "current" document
app.save() # Saves "current" document only
2. Monolithic ParameterSets Module ❌
Issue: All 130+ ParameterSet classes crammed into single 3,357-line file
Impact:
- Hard to navigate and maintain
- No logical grouping by domain
- Merge conflicts in team development
- Slow IDE performance
Breakdown:
parametersets.py (3,357 lines):
├── Mappings (147 lines): All DIRECTION_MAP, ALIGNMENT_MAP, etc.
├── Backend System (350 lines): Protocols, backend classes
├── Property Descriptors (250 lines): IntProperty, BoolProperty, etc.
├── ParameterSet Base (150 lines): Base class, metaclass
└── 130+ ParameterSet Classes (2,460 lines):
    ├── Text/Character (15 classes): CharShape, ParaShape, BulletShape, etc.
    ├── Tables (12 classes): Table, Cell, TableCreation, etc.
    ├── Drawing/Shapes (25 classes): ShapeObject, DrawLineAttr, etc.
    ├── Document (18 classes): DocumentInfo, PageDef, SecDef, etc.
    ├── Find/Replace (5 classes): FindReplace, DocFindInfo, etc.
    ├── Forms (8 classes): AutoFill, AutoNum, FieldCtrl, etc.
    ├── Formatting (12 classes): BorderFill, Caption, DropCap, etc.
    ├── Actions (15 classes): FileOpen, FileSaveAs, Print, etc.
    └── Misc (20 classes): Everything else
3. Navigation Paradigm Mismatch ⚠️
Issue: Official model uses object hierarchy, hwpapi uses position-based accessors
Official Model (Object-Based):
# Hypothetical object-based navigation:
document = app.ActiveDocument
section = document.Sections[0]
paragraph = section.Paragraphs[5]
text = paragraph.Text
Current hwpapi (Position-Based):
# Position-based navigation:
app.move.current_list(para=5, pos=0)
app.actions.CharShape(...)
Analysis: Current approach is more pragmatic for HWP's position-based model. No change needed.
4. Form Controls Not Exposed ❌
Issue: No access to IXHwpForms, form button controls, etc.
Impact:
- Cannot automate form-based documents
- Cannot create interactive PDFs with forms
- Missing feature parity with official API
Proposed Restructuring Plan
Phase 1: Reorganize ParameterSets Module (High Priority)
Goal: Split monolithic parametersets.py into domain-based submodules
New Structure:
hwpapi/
├── parametersets/
│   ├── __init__.py          # Re-export all classes for compatibility
│   ├── base.py              # ParameterSet base class, metaclass
│   ├── backends.py          # Backend protocol, implementations
│   ├── properties.py        # Property descriptors
│   ├── mappings.py          # All DIRECTION_MAP, ALIGNMENT_MAP, etc.
│   ├── text/
│   │   ├── __init__.py
│   │   ├── character.py     # CharShape, BulletShape
│   │   ├── paragraph.py     # ParaShape, TabDef, ListProperties
│   │   └── numbering.py     # NumberingShape, AutoNum
│   ├── table/
│   │   ├── __init__.py
│   │   ├── table.py         # Table, TableCreation
│   │   └── cell.py          # Cell, CellBorderFill
│   ├── drawing/
│   │   ├── __init__.py
│   │   ├── shape.py         # ShapeObject, DrawLayout
│   │   ├── line.py          # DrawLineAttr
│   │   ├── image.py         # DrawImageAttr, DrawImageScissoring
│   │   └── effects.py       # DrawShadow, DrawRotate, DrawTextart
│   ├── document/
│   │   ├── __init__.py
│   │   ├── info.py          # DocumentInfo, SummaryInfo, VersionInfo
│   │   ├── page.py          # PageDef, PageBorderFill, MasterPage
│   │   └── section.py       # SecDef, ColDef
│   ├── formatting/
│   │   ├── __init__.py
│   │   ├── border.py        # BorderFill, BorderFillExt
│   │   ├── caption.py       # Caption, FootnoteShape
│   │   └── style.py         # Style, StyleTemplate
│   ├── actions/
│   │   ├── __init__.py
│   │   ├── file.py          # FileOpen, FileSaveAs, FileConvert
│   │   ├── edit.py          # FindReplace, ConvertCase, etc.
│   │   └── print.py         # Print, PrintToImage, PrintWatermark
│   └── forms/
│       ├── __init__.py
│       └── fields.py        # AutoFill, FieldCtrl, HyperLink
Benefits:
- Maintainability: Find CharShape in text/character.py, not line 850 of monolith
- Team Development: Fewer merge conflicts with separated files
- IDE Performance: Faster autocomplete, syntax highlighting
- Logical Grouping: Related classes together by domain
- Backward Compatible: Re-export from __init__.py preserves existing imports
Migration:
# Old import (still works):
from hwpapi.parametersets import CharShape, ParaShape, Table
# New import (also works):
from hwpapi.parametersets.text.character import CharShape
from hwpapi.parametersets.text.paragraph import ParaShape
from hwpapi.parametersets.table.table import Table
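The claim that both import styles keep working rests on the package __init__.py re-exporting the same class object. The sketch below builds a throwaway package on disk to demonstrate that; demo_psets and its layout are hypothetical and stand in for the proposed hwpapi/parametersets/ package.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create demo_psets/ with a text/character.py module re-exported at the top level."""
    pkg = root / "demo_psets"
    (pkg / "text").mkdir(parents=True)
    (pkg / "text" / "__init__.py").write_text("")
    (pkg / "text" / "character.py").write_text("class CharShape:\n    pass\n")
    (pkg / "__init__.py").write_text(
        "from .text.character import CharShape\n__all__ = ['CharShape']\n"
    )


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        old_style = importlib.import_module("demo_psets").CharShape
        new_style = importlib.import_module("demo_psets.text.character").CharShape
    finally:
        sys.path.remove(tmp)

print(old_style is new_style)  # True
```

Because both names refer to one class object, isinstance checks and registries keyed on the class behave the same regardless of which import path a caller uses.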
Estimated Impact:
- Lines reduced: 0 (reorganization, not deletion)
- Files created: ~20 new files
- Maintainability: 🔼 Significantly improved
- IDE performance: 🔼 Improved
Phase 2: Expose Collection Objects (Medium Priority)
Goal: Expose Documents, Windows collections to match official API
New API:
# Add to App class:
class App:
    @property
    def documents(self):
        """Access to IXHwpDocuments collection."""
        return DocumentsCollection(self.api)
    @property
    def windows(self):
        """Access to IXHwpWindows collection."""
        return WindowsCollection(self.api)
    @property
    def active_document(self):
        """Currently active document."""
        return Document(self.api.ActiveDocument)
# New collection classes:
class DocumentsCollection:
    def __init__(self, hwp_object):
        self._hwp = hwp_object
    def __len__(self):
        return self._hwp.Documents.Count
    def __getitem__(self, index):
        # Raise IndexError so "for doc in app.documents" stops cleanly;
        # an error raised by Item() itself would not end the loop.
        if not 0 <= index < len(self):
            raise IndexError(index)
        return Document(self._hwp.Documents.Item(index))
    def add(self):
        """Create new document."""
        return Document(self._hwp.Documents.Add())
class Document:
    def __init__(self, doc_com_object):
        self._doc = doc_com_object
    @property
    def full_name(self):
        return self._doc.FullName
    def save(self):
        return self._doc.Save()
    def close(self):
        return self._doc.Close()
Usage:
app = App()
# Access collections:
print(f"Open documents: {len(app.documents)}")
doc1 = app.documents[0]
doc2 = app.documents[1]
# Multi-document workflows:
for doc in app.documents:
    print(doc.full_name)
    doc.save()
# Create new document:
new_doc = app.documents.add()
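The proposed collection can be exercised without HWP by substituting fake COM objects. In this sketch the Fake* classes are invented stand-ins, and the shape of the COM side (a Documents member with Count and a zero-based Item(index)) is taken from the proposal above, not verified against the real interface.

```python
class FakeDoc:
    def __init__(self, full_name):
        self.FullName = full_name


class FakeDocuments:
    def __init__(self, names):
        self._docs = [FakeDoc(n) for n in names]
    @property
    def Count(self):
        return len(self._docs)
    def Item(self, index):
        return self._docs[index]


class FakeHwp:
    def __init__(self, names):
        self.Documents = FakeDocuments(names)


class Document:
    def __init__(self, doc_com_object):
        self._doc = doc_com_object
    @property
    def full_name(self):
        return self._doc.FullName


class DocumentsCollection:
    def __init__(self, hwp_object):
        self._hwp = hwp_object

    def __len__(self):
        return self._hwp.Documents.Count

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(f"document index {index} out of range")
        return Document(self._hwp.Documents.Item(index))

    def __iter__(self):
        # Explicit iteration avoids relying on Item() to signal the end.
        return (self[i] for i in range(len(self)))


docs = DocumentsCollection(FakeHwp(["a.hwp", "b.hwp"]))
print([d.full_name for d in docs])  # ['a.hwp', 'b.hwp']
```

Defining __iter__ and raising IndexError in __getitem__ keeps loops and indexing predictable even if the underlying COM call reports out-of-range access with a different exception type.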
Benefits:
- Feature parity with official API
- Multi-document support
- More explicit than implicit "current document"
- Better for automation scripts
Phase 3: Add Form Controls Support (Low Priority)
Goal: Expose form controls for interactive documents
New Classes:
# Add to App:
class App:
    @property
    def forms(self):
        """Access to form controls."""
        return FormsCollection(self.api)
class FormsCollection:
    def __init__(self, hwp_object):
        self._hwp = hwp_object
    @property
    def push_buttons(self):
        return PushButtonsCollection(self._hwp.Forms.PushButtons)
    @property
    def check_buttons(self):
        return CheckButtonsCollection(self._hwp.Forms.CheckButtons)
    # etc...
Benefits:
- Support form-based documents
- Enable interactive workflows
- Complete API coverage
Restructuring Priorities (Updated)
| Priority | Task | Lines Saved | Complexity Reduction | User Impact |
|----------|------|-------------|---------------------|-------------|
| 1 | Split parametersets.py by domain | 0 (reorg) | 🔼🔼🔼 High | Low (internal) |
| 2 | Unify backend modes | ~200 | 🔼🔼 Medium | Low (internal) |
| 3 | Expose Documents/Windows collections | +150 | 🔽 Slight increase | 🔼🔼 High (feature) |
| 4 | Consolidate property types | ~200 | 🔼 Medium | Low (internal) |
| 5 | Add Form controls support | +200 | 🔽 Slight increase | 🔼 Medium (feature) |
| 6 | Remove forward declarations | ~25 | 🔼 Small | None |
Recommendation: Start with Priority 1 (split parametersets.py) as it has:
- Highest maintainability impact
- Zero breaking changes
- Easiest to implement (move code, update imports)
Implementation Strategy
Step 1: Prepare New Structure (parametersets/ package)
1. Create directory structure:
     mkdir -p hwpapi/parametersets/{text,table,drawing,document,formatting,actions,forms}
2. Create __init__.py files with re-exports:
     # hwpapi/parametersets/__init__.py
     from .base import ParameterSet, ParameterSetMeta
     from .backends import *
     from .properties import *
     from .text.character import CharShape, BulletShape
     from .text.paragraph import ParaShape, TabDef
     # ... (re-export all classes to preserve imports)
     __all__ = ['ParameterSet', 'CharShape', 'ParaShape', ...]  # Full list
3. Move classes to domain files:
   - Extract CharShape, BulletShape → text/character.py
   - Extract ParaShape, TabDef → text/paragraph.py
   - Continue for all 130+ classes
4. Update notebook:
   - Split nbs/02_api/02_parameters.ipynb into multiple notebooks:
     - 02_parameters_base.ipynb → base.py
     - 02_parameters_text_char.ipynb → text/character.py
     - etc.
   - Or keep single notebook with clear section markers
5. Test imports:
     # Ensure backward compatibility:
     from hwpapi.parametersets import CharShape  # Should still work
Step 2: Document Migration
- Update CLAUDE.md with new file mappings
- Update README with new structure
- Create migration guide for contributors
Step 3: Gradual Rollout
- Phase 1a: Move base classes, backends, properties (low risk)
- Phase 1b: Move text-related classes (medium risk)
- Phase 1c: Move remaining classes (high risk, test thoroughly)
- Phase 1d: Update documentation, examples
Benefits Summary
Immediate Benefits (Phase 1 - Reorganization):
- ✓ Navigability: Find classes by domain instead of scrolling a 3,357-line file
- ✓ Maintainability: Logical grouping by domain
- ✓ Team Collaboration: Fewer merge conflicts
- ✓ IDE Performance: Faster autocomplete
- ✓ Code Reviews: Easier to review focused changes
- ✓ Zero Breaking Changes: Backward compatible via re-exports
Long-term Benefits (Phase 2-3 - New Features):
- ✓ API Completeness: Match official HWP API surface
- ✓ Multi-document Support: Automate across multiple files
- ✓ Form Support: Interactive document automation
- ✓ Better Alignment: Official docs map directly to code structure
Non-Goals:
- ❌ Don't change the property descriptor system (it's excellent)
- ❌ Don't change the backend abstraction (it works well)
- ❌ Don't change the Actions pattern (pythonic and convenient)
- ❌ Don't add complexity for theoretical future needs
Migration Checklist
When implementing the restructuring:
Preparation:
- [ ] Read official HwpAutomation_2504.pdf documentation
- [ ] Understand current parametersets.py organization
- [ ] Create domain-based file structure
Phase 1 Execution:
- [ ] Create hwpapi/parametersets/ package structure
- [ ] Split notebooks (or keep single notebook with sections)
- [ ] Move base classes to base.py
- [ ] Move backend classes to backends.py
- [ ] Move property descriptors to properties.py
- [ ] Move mappings to mappings.py
- [ ] Move ParameterSet subclasses to domain files
- [ ] Create __init__.py with full re-exports
- [ ] Run nbdev_export
- [ ] Test all imports: python -c "from hwpapi.parametersets import CharShape, ParaShape, Table"
- [ ] Run full test suite: python -m pytest tests/ -v
- [ ] Update CLAUDE.md file mapping table
- [ ] Commit changes
Phase 2 Execution (Optional):
- [ ] Design DocumentsCollection, WindowsCollection classes
- [ ] Add app.documents, app.windows properties
- [ ] Write tests for multi-document workflows
- [ ] Update documentation with examples
- [ ] Commit changes
Phase 3 Execution (Optional):
- [ ] Design FormsCollection classes
- [ ] Add app.forms property
- [ ] Write tests for form controls
- [ ] Update documentation
- [ ] Commit changes
Version History
Recent Changes
2025-12-09 - Logging System Improvements
- Default Log Level Changed: Changed from INFO to WARNING
  - Production-friendly: Normal users only see warnings, errors, and critical messages
  - Reduces log clutter in release builds
  - Developers can still enable detailed logging via HWPAPI_LOG_LEVEL=DEBUG or INFO
- Enhanced Documentation: Added comprehensive logging configuration examples
- Environment Variable: HWPAPI_LOG_LEVEL now defaults to WARNING instead of INFO
- Files: nbs/02_api/06_logging.ipynb, hwpapi/logging.py
- Impact: Cleaner output for end users, opt-in verbose logging for developers
2025-12-09 - Complete Display Enhancement Suite
- Critical Bug Fix: Removed duplicate ParameterSet class definition in cell 26
  - Entire class (27,304 characters) was duplicated
  - Second _format_int_value overrode first with old logic
  - Caused FontSize to show as 1200 instead of 12.0pt
- Enhancement 1: Human-Readable Value Formatting
  - Colors: 0x0000FF → #FF0000 (BBGGRR to hex)
  - Font sizes: 1200 → 12.0pt (HWPUNIT to pt)
  - Dimensions: 59430 → 210.0mm (HWPUNIT to mm)
  - Booleans: Display as True/False
- Enhancement 2: Enum Display for MappedProperty
  - Before: Direction="down"
  - After: Direction=0 (down) (shows both value and name)
  - Works with Korean: Type=0 (일반책갈피)
  - Automatically detects MappedProperty and formats accordingly
- Enhancement 3: Property Description Comments
  - Before: FontSize=12.0pt
  - After: FontSize=12.0pt # Font size in HWPUNIT (100 = 1pt)
  - Shows inline documentation for every property
  - Supports multilingual descriptions (Korean, English)
- Complete Example:
    CharFormat(
      FontSize=12.0pt       # Font size in HWPUNIT (100 = 1pt)
      TextColor="#ff0000"   # Text color in BBGGRR format
      Direction=0 (down)    # Search direction (down=forward, up=backward)
    )
- Detection Tools: Added scripts to detect duplicate method definitions
- Documentation: Updated CLAUDE.md with Issue 5, debugging tips, examples
- Result: Self-documenting, human-readable parameter display
- Files: nbs/02_api/02_parameters.ipynb, hwpapi/parametersets.py
- Examples: nested_property_demo.py, mapped_property_display_demo.py, property_description_display_demo.py
2025-01-08 - Auto-Creating Properties Design
- Designed NestedProperty for auto-creating nested ParameterSets
- Designed ArrayProperty for Pythonic HArray interface
- Enhanced UnitProperty for smart unit conversion (mm, cm, in, pt → HWPUNIT)
- Created comprehensive design documents:
  - AUTO_PROPERTY_DESIGN.md - NestedProperty & ArrayProperty specification
  - UNIT_PROPERTY_ENHANCEMENT.md - Smart unit conversion specification
- Updated CLAUDE.md with complete documentation:
  - "Auto-Creating Properties" section with full examples
  - UnitProperty section with unit conversion examples
  - Migration guides for all property types
  - Property type decision tree with unit selection guide
- Key improvements:
  - Tab completion for nested properties
  - No manual create_itemset() calls
  - Pythonic array interface (append, insert, pop, etc.)
  - Intuitive units: "210mm", "21cm", "8.27in" instead of HWPUNIT
- Result: Intuitive API that feels natural for Python developers
- Status: Design complete, ready for implementation
2025-01-08 - Architecture Analysis & Restructuring Plan
- Analyzed official HWP Automation Object Model (HwpAutomation_2504.pdf)
- Compared official structure with current hwpapi implementation
- Identified 4 major gaps: Collection objects, monolithic parametersets.py, form controls, organization
- Designed 3-phase restructuring plan:
- Phase 1: Split parametersets.py into domain-based modules (highest priority)
- Phase 2: Expose Documents/Windows collections (medium priority)
- Phase 3: Add Form controls support (low priority)
- Added comprehensive "HWP Object Model: Official vs Current Architecture" section to CLAUDE.md
- Result: Clear roadmap for better alignment with official API and improved maintainability
2024 - Auto-Generated attributes_names
- Removed manual self.attributes_names = [...] from 9+ classes
- Added @property attributes_names to ParameterSet
- Fixed _del_value to handle None backend
- Updated tests to use property descriptors
- Result: -500 lines, eliminated sync issues
2024 - Added _is_com Function
- Fixed NameError in _looks_like_pset
- Added proper COM object detection
- Now properly distinguishes COM vs non-COM objects
Earlier - Pset Migration
- See PSET_MIGRATION_SUMMARY.md
- Migrated from HSet-based to pset-based approach
- Maintained backward compatibility
Checklist for New Contributors
Before making changes:
- [ ] Read this entire document
- [ ] Understand nbdev workflow (CRITICAL)
- [ ] Know the file mapping (notebook → .py)
- [ ] Set up development environment
- [ ] Can run nbdev_export successfully
- [ ] Can run tests
Before committing:
- [ ] Edited notebook, NOT .py file
- [ ] Ran nbdev_export
- [ ] Ran tests (python -m pytest tests/ -v)
- [ ] Verified imports work
- [ ] Added/updated tests if needed
- [ ] Committing both .ipynb and .py files
- [ ] Commit message describes what and why
Quick Reference Commands
# Export notebooks to Python
nbdev_export
# Run all tests
python -m pytest tests/ -v
# Test specific module
python -m pytest tests/test_hparam.py -v
# Verify imports
python -c "import hwpapi; print('OK')"
# Check notebook validity
python -c "import nbformat; nbformat.validate(nbformat.read('nbs/02_api/02_parameters.ipynb', as_version=4)); print('OK')"
# Generate documentation
nbdev_docs
# Clean generated files
nbdev_clean
# Preview documentation
nbdev_preview
Remember
- This is nbdev: Source is in notebooks, .py files are generated
- Backend abstraction works: Trust make_backend() factory
- Properties are auto-registered: No manual attributes_names needed
- Always check for None: Backend might not be initialized
- Test your changes: Don't break existing functionality
- Keep it simple: Prefer simplification over clever abstractions
- Document decisions: Update this file when you learn something new
This document is a living guide. Update it as you learn more about the codebase.
Last Updated: 2025-12-09 (After fixing duplicate class and display formatting)
Next Review: After Phase 1 restructuring (split parametersets.py by domain)