feat: Convert to Claude Code plugin marketplace

Transform repository into a plugin marketplace structure with two plugins:

- claude-code plugin: Complete toolkit with 5 skills
  * claude-code-plugins
  * claude-code-slash-commands
  * claude-code-hooks
  * claude-code-subagents
  * claude-code-memory
- claude-skills plugin: Meta-skill for creating Agent Skills
  * Comprehensive best practices guide
  * Templates and examples
  * Progressive disclosure patterns

Infrastructure:
- Add marketplace.json manifest
- Create plugin.json for each plugin
- Update documentation for marketplace structure
- Add contribution and testing guides

Installation:
- /plugin install claude-code@claude-skills
- /plugin install claude-skills@claude-skills

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-17 11:17:09 -05:00
parent cb6b9f532a
commit 7911d90995
19 changed files with 6112 additions and 0 deletions
--- a/TESTING.md
+++ b/TESTING.md
@@ -0,0 +1,446 @@
+# Testing and Validating Skills
+
+This guide helps you validate skills before adding them to the repository or using them in production.
+
+## Quick Validation Checklist
+
+Run through this checklist before submitting a skill:
+
+```
+Metadata
+[ ] SKILL.md exists
+[ ] YAML frontmatter is valid
+[ ] Name ≤ 64 characters
+[ ] Description ≤ 1024 characters
+[ ] Description includes trigger scenarios
+
+Content Quality
+[ ] "When to Use This Skill" section present
+[ ] At least one concrete example
+[ ] Examples are runnable/testable
+[ ] File references are accurate
+[ ] No sensitive data hardcoded
+
+Triggering Tests
+[ ] Triggers on target scenarios
+[ ] Doesn't trigger on unrelated scenarios
+[ ] No conflicts with similar skills
+
+Security
+[ ] No credentials or API keys
+[ ] No personal information
+[ ] Safe file system access only
+[ ] External dependencies verified
+```
+
+## Detailed Testing Process
+
+### 1. Metadata Validation
+
+#### Test YAML Parsing
+
+Try parsing the frontmatter:
+
+```bash
+# Extract the YAML frontmatter (the lines between the first two --- delimiters)
+awk '/^---$/ { n++; next } n == 1' SKILL.md
+```
+
+Verify:
+- YAML is valid (no syntax errors)
+- Both `name` and `description` are present
+- Values are within character limits
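+
+The presence check above can be sketched as a small shell helper. This is an illustrative sketch, not part of the repository: the function names `frontmatter` and `check_frontmatter` are hypothetical, and the check only confirms that both keys appear in the frontmatter. It does not parse the YAML.
+
+```bash
+# Hypothetical helpers (not part of this repository).
+
+# Print the frontmatter: the lines between a leading '---' and the next '---'.
+frontmatter() {
+    awk 'NR == 1 { if ($0 !~ /^---[[:space:]]*$/) exit; next }
+         /^---[[:space:]]*$/ { exit }
+         { print }' "$1"
+}
+
+# Succeed only if the frontmatter has both a name: and a description: key.
+check_frontmatter() {
+    local file=$1 fm key status=0
+    fm=$(frontmatter "$file")
+    for key in name description; do
+        if ! printf '%s\n' "$fm" | grep -q "^${key}:"; then
+            echo "Missing '${key}' in ${file}" >&2
+            status=1
+        fi
+    done
+    return "$status"
+}
+```
+
+Calling `check_frontmatter SKILL.md` returns a non-zero status when either key is missing, so it can gate a larger script.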
+
+#### Character Limits
+
+```bash
+# Count characters in name (must be ≤ 64); tr strips the trailing newline
+grep -m 1 "^name:" SKILL.md | sed 's/^name: *//' | tr -d '\n' | wc -m
+
+# Count characters in description (must be ≤ 1024); assumes a single-line value
+grep -m 1 "^description:" SKILL.md | sed 's/^description: *//' | tr -d '\n' | wc -m
+```
+
+### 2. Content Quality Testing
+
+#### Check Required Sections
+
+```bash
+# Verify "When to Use This Skill" section exists
+grep -i "when to use" SKILL.md
+
+# Verify examples exist
+grep -i "example" SKILL.md
+```
+
+#### Test File References
+
+If the skill references other files, verify they exist:
+
+```bash
+# Find markdown links
+grep -o '\[[^]]*\]([^)]*\.md)' SKILL.md
+
+# Check if referenced files exist
+# (manually verify each one)
+```
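+
+The manual existence check can be partly automated. The sketch below assumes links are relative to the directory of the file that contains them, and it skips anything that looks like a URL. `missing_links` is a hypothetical name, not an existing tool.
+
+```bash
+# Hypothetical helper: print relative .md link targets that do not exist on disk.
+missing_links() {
+    local file=$1 dir target
+    dir=$(dirname "$file")
+    # Match "](target.md", drop the leading "](", and ignore external URLs.
+    grep -oE '\]\([^)#]+\.md' "$file" | sed 's/^](//' | grep -v '://' |
+    while IFS= read -r target; do
+        [ -e "$dir/$target" ] || echo "$target"
+    done
+}
+```
+
+Running `missing_links SKILL.md` prints one line per broken reference and prints nothing when every target exists.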
+
+#### Validate Examples
+
+For each example in the skill:
+1. Try running the code/commands
+2. Verify output matches expectations
+3. Check for edge cases
+4. Ensure examples are complete (no placeholders)
+
+### 3. Trigger Testing
+
+This is the most important validation step.
+
+#### Create Test Scenarios
+
+**Positive Tests (SHOULD trigger)**
+
+Create a list of scenarios where the skill should activate:
+
+```markdown
+Test Scenario 1: [Describe task that should trigger]
+Expected: Skill activates
+Actual: [Test result]
+
+Test Scenario 2: [Another trigger case]
+Expected: Skill activates
+Actual: [Test result]
+```
+
+**Negative Tests (SHOULD NOT trigger)**
+
+Create scenarios where the skill should NOT activate:
+
+```markdown
+Test Scenario 3: [Similar but different task]
+Expected: Skill does NOT activate
+Actual: [Test result]
+
+Test Scenario 4: [Unrelated task]
+Expected: Skill does NOT activate
+Actual: [Test result]
+```
+
+#### Example Testing Session
+
+For a "Python Testing with pytest" skill:
+
+**Should Trigger:**
+- "Help me write tests for my Python function"
+- "How do I use pytest fixtures?"
+- "Create unit tests for this class"
+
+**Should NOT Trigger:**
+- "Help me test my JavaScript code" (different language)
+- "Debug my pytest installation" (installation, not testing)
+- "Explain what unit testing is" (concept, not implementation)
+
+#### Run Tests with Claude
+
+1. Load the skill
+2. Ask Claude each test question
+3. Observe whether the skill triggers (check the response for skill context)
+4. Document results
+
+### 4. Token Efficiency Testing
+
+#### Measure Content Size
+
+```bash
+# Count tokens (approximate: words × 1.3)
+wc -w SKILL.md
+
+# Or use a proper token counter
+# (tokens ≈ characters ÷ 4 for rough estimate)
+wc -c SKILL.md
+```
+
+#### Evaluate Split Points
+
+Ask yourself:
+- Is content loaded only when needed?
+- Could mutually exclusive sections be split?
+- Are examples concise but complete?
+- Is reference material in separate files?
+
+Target sizes:
+- **SKILL.md**: Under 3000 tokens (core workflows)
+- **Additional files**: Load only when referenced
+- **Total metadata**: ~100 tokens
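+
+As a rough way to check a file against these targets, the sketch below applies the characters ÷ 4 estimate and compares it with a budget. The helper names are hypothetical, and the estimate is only an approximation of a real tokenizer's count.
+
+```bash
+# Hypothetical helpers (not part of this repository).
+
+# Rough token estimate for a file: character count divided by 4.
+estimate_tokens() {
+    local chars
+    chars=$(wc -m < "$1")
+    echo $(( chars / 4 ))
+}
+
+# Report the estimate; return non-zero when it exceeds the budget (default 3000).
+check_token_budget() {
+    local file=$1 budget=${2:-3000} tokens
+    tokens=$(estimate_tokens "$file")
+    if [ "$tokens" -gt "$budget" ]; then
+        echo "${file}: ~${tokens} tokens exceeds budget of ${budget}"
+        return 1
+    fi
+    echo "${file}: ~${tokens} tokens (within ${budget})"
+}
+```
+
+For example, `check_token_budget SKILL.md` uses the 3000-token target, and `check_token_budget SKILL.md 2000` applies a stricter limit.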
+
+### 5. Security Validation
+
+#### Automated Checks
+
+```bash
+# Check for potential secrets
+grep -iE "(password|api[_-]?key|secret|token|credential)" SKILL.md
+
+# Check for hardcoded paths
+grep -E "(/Users/|/home/|C:\\\\)" SKILL.md
+
+# Check for sensitive file extensions
+grep -E "\.(key|pem|cert|p12|pfx)( |$)" SKILL.md
+```
+
+#### Manual Review
+
+Review each file for:
+- [ ] No credentials in examples
+- [ ] No personal information
+- [ ] File paths are generic/relative
+- [ ] Network access is documented
+- [ ] External dependencies are from trusted sources
+- [ ] Scripts don't make unsafe system changes
+
+### 6. Cross-Skill Conflict Testing
+
+If you have multiple skills installed:
+
+1. **Similar domain overlap**: Test that specific skills trigger (not generic ones)
+2. **Keyword conflicts**: Check whether multiple skills trigger on the same query
+3. **Description clarity**: Ensure each skill's domain is distinct
+
+Example conflicts to avoid:
+- "Python Helper" (too generic) vs "Python Testing with pytest" (specific)
+- Both trigger on "Help with Python" → Fix by making descriptions more specific
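+
+One way to spot likely keyword conflicts before testing with Claude is to list the words two descriptions share. This is a heuristic sketch with hypothetical helper names. Shared words only hint at overlap and do not predict actual triggering behavior.
+
+```bash
+# Hypothetical helpers (not part of this repository).
+
+# Split text into unique lowercase words of four or more letters, one per line.
+description_words() {
+    printf '%s\n' "$1" | tr -cs '[:alpha:]' '\n' | tr '[:upper:]' '[:lower:]' |
+        awk 'length($0) >= 4' | sort -u
+}
+
+# Print words present in both descriptions. Each list is already unique,
+# so any word that appears twice in the combined list is shared.
+shared_keywords() {
+    { description_words "$1"; description_words "$2"; } | sort | uniq -d
+}
+```
+
+For the conflict above, `shared_keywords "Python development helpers" "Write Python tests with pytest"` prints `python`, the one term both descriptions rely on.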
+
+## Testing Workflows
+
+### Quick Test (5 minutes)
+
+For minor updates or simple skills:
+
+1. ✓ Validate metadata (YAML, character limits)
+2. ✓ Check one example works
+3. ✓ Test one positive trigger
+4. ✓ Test one negative trigger
+5. ✓ Scan for secrets
+
+### Standard Test (15 minutes)
+
+For new skills or significant changes:
+
+1. ✓ Complete metadata validation
+2. ✓ Test all examples
+3. ✓ Run 3-5 trigger tests (positive + negative)
+4. ✓ Check token efficiency
+5. ✓ Full security review
+6. ✓ Verify file references
+
+### Comprehensive Test (30+ minutes)
+
+For complex skills or pre-release:
+
+1. ✓ All standard tests
+2. ✓ Test with different Claude models
+3. ✓ Test conflict scenarios with other skills
+4. ✓ Have someone else try the skill
+5. ✓ Test edge cases in examples
+6. ✓ Review progressive disclosure strategy
+7. ✓ Load test (simulate typical usage)
+
+## Common Issues and Fixes
+
+### Skill Doesn't Trigger
+
+**Symptoms**: Claude doesn't load skill context when expected
+
+**Diagnose**:
+1. Description too vague?
+2. Description missing trigger keywords?
+3. Name too generic?
+
+**Fix**:
+```yaml
+# Before
+description: Python development helpers
+
+# After
+description: Create Python projects using Hatch and Hatchling for dependency management. Use when initializing new Python packages or configuring build systems.
+```
+
+### Skill Triggers Too Often
+
+**Symptoms**: Skill loads for unrelated queries
+
+**Diagnose**:
+1. Description too broad?
+2. Keywords too common?
+
+**Fix**:
+```yaml
+# Add specificity and exclusions
+description: Debug Swift applications using LLDB for crashes, memory issues, and runtime errors. Use when investigating Swift bugs or analyzing app behavior. NOT for general Swift coding or learning.
+```
+
+### Examples Don't Work
+
+**Symptoms**: Users can't reproduce examples
+
+**Diagnose**:
+1. Missing prerequisites?
+2. Placeholders not explained?
+3. Environment-specific code?
+
+**Fix**:
+- Add prerequisites section
+- Make examples self-contained
+- Use generic paths and values
+
+### High Token Usage
+
+**Symptoms**: Skill loads too much content
+
+**Diagnose**:
+1. Too much in SKILL.md?
+2. No progressive disclosure?
+3. Verbose examples?
+
+**Fix**:
+- Split reference material to separate files
+- Link to external resources
+- Condense examples
+- Move advanced content to on-demand files
+
+## Automated Testing (Advanced)
+
+For repositories with many skills, consider automation:
+
+### Validate All Skills
+
+```bash
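+#!/bin/bash
+# validate-skills.sh
+# Exits non-zero if any skill fails a check, so CI can catch failures.
+
+status=0
+for skill_dir in */; do
+    skill_file="${skill_dir}SKILL.md"
+    [ -f "$skill_file" ] || continue
+    echo "Validating $skill_dir..."
+    errors=0
+
+    # Check that frontmatter starts on the first line
+    if [ "$(head -n 1 "$skill_file")" != "---" ]; then
+        echo "❌ Missing YAML frontmatter"
+        errors=1
+    fi
+
+    # Check name length (first name: line only)
+    name=$(grep -m 1 "^name:" "$skill_file" | sed 's/^name: *//')
+    if [ "${#name}" -gt 64 ]; then
+        echo "❌ Name too long: ${#name} chars"
+        errors=1
+    fi
+
+    # Check for secrets (warning only; matches may be legitimate prose)
+    if grep -qiE "(password|api[_-]?key|secret)" "$skill_file"; then
+        echo "⚠️  Potential secrets found"
+    fi
+
+    # Report success only when no check failed
+    if [ "$errors" -eq 0 ]; then
+        echo "✓ $skill_dir validated"
+    else
+        status=1
+    fi
+done
+exit "$status"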
+```
+
+### CI/CD Integration
+
+Add to GitHub Actions or similar:
+
+```yaml
+name: Validate Skills
+on: [push, pull_request]
+
+jobs:
+  validate:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Run validation
+        run: |
+          chmod +x validate-skills.sh
+          ./validate-skills.sh
+```
+
+## Documentation Testing
+
+Ensure documentation is accurate:
+
+1. **Links work**: All markdown links resolve
+2. **Paths are correct**: File references are accurate
+3. **Examples are current**: Code samples match latest versions
+4. **Formatting is consistent**: Markdown renders correctly
+
+```bash
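+# Check for broken internal links (targets resolved relative to each file)
+grep -roE --include='*.md' '\]\([^)#]+\.md' . | grep -v '://' |
+while IFS= read -r match; do
+    file=${match%%:*}          # file that contains the link
+    target=${match##*\]\(}     # link target without the leading "]("
+    [ -e "$(dirname "$file")/$target" ] || echo "Broken link in $file: $target"
+done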
+```
+
+## User Acceptance Testing
+
+The ultimate test is real usage:
+
+1. **Give skill to others**: Have colleagues test it
+2. **Monitor usage**: See when it triggers in practice
+3. **Gather feedback**: Ask users about clarity and usefulness
+4. **Iterate**: Refine based on real-world usage
+
+## Testing Checklist Template
+
+Copy this for each skill you test:
+
+```markdown
+# Testing Report: [Skill Name]
+
+Date: [YYYY-MM-DD]
+Tester: [Name]
+
+## Metadata
+- [ ] YAML valid
+- [ ] Name ≤ 64 chars
+- [ ] Description ≤ 1024 chars
+- [ ] Trigger scenarios in description
+
+## Content
+- [ ] "When to Use" section present
+- [ ] Examples runnable
+- [ ] File references accurate
+- [ ] No secrets
+
+## Triggering
+Positive tests:
+1. [Scenario] - Result: [ ] Pass [ ] Fail
+2. [Scenario] - Result: [ ] Pass [ ] Fail
+
+Negative tests:
+1. [Scenario] - Result: [ ] Pass [ ] Fail
+2. [Scenario] - Result: [ ] Pass [ ] Fail
+
+## Security
+- [ ] No credentials
+- [ ] No personal data
+- [ ] Safe file access
+- [ ] Dependencies verified
+
+## Overall
+- [ ] Ready for production
+- [ ] Needs revision
+- [ ] Rejected
+
+Notes:
+[Any additional observations]
+```
+
+## Resources
+
+- [claude-skills/SKILL.md](./claude-skills/SKILL.md) - Best practices guide
+- [claude-skills/checklist.md](./claude-skills/checklist.md) - Quality checklist
+- [CONTRIBUTING.md](./CONTRIBUTING.md) - Contribution guidelines
+
+---
+
+**Remember**: Testing isn't just about finding bugs—it's about ensuring your skill provides real value and triggers at the right time.