Skip to content

Commit 67796d8

Browse files
authored
Merge pull request #1 from atakuzi/claude/extract-publish-posts-NPn6q
Add blog post conversion script and summary
2 parents b28380a + 9222dc7 commit 67796d8

File tree

2 files changed

+225
-0
lines changed

2 files changed

+225
-0
lines changed

CONVERSION_SUMMARY.md

Lines changed: 102 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,102 @@
1+
# System Design 101 to Blog Posts Conversion Summary
2+
3+
## Overview
4+
5+
Successfully extracted and converted **400 system design guides** from this repository into individual Jekyll blog posts for publishing on atakuzi.github.io.
6+
7+
## What Was Done
8+
9+
### 1. Repository Analysis
10+
- Explored the system-design-101 repository structure
11+
- Found 400 markdown guides in `/data/guides/` directory
12+
- Analyzed the existing atakuzi.github.io Jekyll site structure
13+
14+
### 2. Conversion Script
15+
Created `/home/user/convert_guides.py` which:
16+
- Extracts frontmatter from each guide
17+
- Converts to Jekyll-compatible format with proper frontmatter:
18+
- `layout: post`
19+
- `title` from original guide
20+
- `subtitle` from description
21+
- `date` from createdAt field
22+
- `tags` from original tags
23+
- Preserves all markdown content and images
24+
- Names files with proper Jekyll convention: `YYYY-MM-DD-title.md`
25+
26+
### 3. Conversion Results
27+
- **Total guides converted**: 400
28+
- **Success rate**: 100% (400/400)
29+
- **Failed conversions**: 0
30+
- **Output location**: `/home/user/atakuzi.github.io/_posts/`
31+
32+
### 4. Content Categories Included
33+
The converted posts cover:
34+
- API and Web Development (78 guides)
35+
- Real World Case Studies (32 guides)
36+
- AI and Machine Learning (8 guides)
37+
- Database and Storage (48 guides)
38+
- Technical Interviews (5 guides)
39+
- Caching & Performance (29 guides)
40+
- Payment and Fintech (16 guides)
41+
- Software Architecture (18 guides)
42+
- DevTools & Productivity (19 guides)
43+
- Software Development (29 guides)
44+
- Cloud & Distributed Systems (50 guides)
45+
- How it Works? (14 guides)
46+
- DevOps and CI/CD (28 guides)
47+
- Security (33 guides)
48+
- Computer Fundamentals (13 guides)
49+
50+
## Current Status
51+
52+
### ✅ Completed
53+
1. Cloned atakuzi.github.io repository
54+
2. Created conversion script
55+
3. Converted all 400 guides successfully
56+
4. Committed changes locally to atakuzi.github.io
57+
58+
### ⚠️ Pending
59+
The changes are committed locally but need to be pushed to the remote repository. This requires GitHub authentication.
60+
61+
### Commit Details
62+
- **Repository**: atakuzi.github.io
63+
- **Branch**: main
64+
- **Commit Hash**: 84bb248
65+
- **Commit Message**: "Add 400 system design guides from ByteByteGo"
66+
- **Files Added**: 400 new blog posts
67+
- **Insertions**: 13,127 lines
68+
69+
## Next Steps
70+
71+
To complete the publication, you need to push the changes from the atakuzi.github.io repository:
72+
73+
```bash
74+
cd /home/user/atakuzi.github.io
75+
git push origin main
76+
```
77+
78+
Note: This requires GitHub authentication (personal access token or SSH key).
79+
80+
## Sample Posts Created
81+
82+
Here are a few examples of the converted posts:
83+
- `2024-02-22-top-5-caching-strategies.md`
84+
- `2024-01-28-system-design-cheat-sheet.md`
85+
- `2024-03-15-how-does-docker-work.md`
86+
- `2024-03-13-rest-api-cheatsheet.md`
87+
- `2024-02-12-100x-postgres-scaling-at-figma.md`
88+
89+
## Files and Tools
90+
91+
- **Conversion Script**: `/home/user/convert_guides.py`
92+
- **Source Directory**: `/home/user/system-design-101/data/guides/`
93+
- **Output Directory**: `/home/user/atakuzi.github.io/_posts/`
94+
- **Blog Site**: atakuzi.github.io (Jekyll/GitHub Pages)
95+
96+
## Notes
97+
98+
- All posts include original images from ByteByteGo CDN
99+
- Post dates preserved from original `createdAt` field
100+
- All posts maintain original markdown formatting
101+
- Tags and categories preserved for proper organization
102+
- The blog will automatically build and deploy once pushed to GitHub (via GitHub Pages)

scripts/convert_guides.py

Lines changed: 123 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,123 @@
1+
#!/usr/bin/env python3
2+
"""
3+
Convert system-design-101 guides to Jekyll blog posts for atakuzi.github.io
4+
"""
5+
6+
import os
7+
import re
8+
from pathlib import Path
9+
from datetime import datetime
10+
11+
def extract_frontmatter(content):
12+
"""Extract YAML frontmatter from markdown content."""
13+
match = re.match(r'^---\n(.*?)\n---\n(.*)$', content, re.DOTALL)
14+
if match:
15+
frontmatter = match.group(1)
16+
body = match.group(2)
17+
return frontmatter, body
18+
return None, content
19+
20+
def parse_frontmatter(frontmatter_text):
21+
"""Parse YAML frontmatter into a dictionary."""
22+
data = {}
23+
for line in frontmatter_text.split('\n'):
24+
if ':' in line:
25+
key, value = line.split(':', 1)
26+
key = key.strip()
27+
value = value.strip().strip('"\'')
28+
if key == 'categories' or key == 'tags':
29+
# Skip, will process next
30+
continue
31+
data[key] = value
32+
elif line.strip().startswith('- '):
33+
# This is a list item
34+
item = line.strip()[2:].strip('"\'')
35+
if 'tags' not in data:
36+
data['tags'] = []
37+
data['tags'].append(item)
38+
return data
39+
40+
def create_jekyll_frontmatter(original_data):
41+
"""Create Jekyll-compatible frontmatter."""
42+
title = original_data.get('title', 'Untitled')
43+
description = original_data.get('description', '')
44+
created_at = original_data.get('createdAt', datetime.now().strftime('%Y-%m-%d'))
45+
tags = original_data.get('tags', [])
46+
47+
# Build the frontmatter
48+
frontmatter = f"""---
49+
layout: post
50+
title: "{title}"
51+
subtitle: "{description}"
52+
date: {created_at}
53+
tags: {tags}
54+
---
55+
"""
56+
return frontmatter
57+
58+
def convert_guide_to_post(guide_path, output_dir):
59+
"""Convert a single guide file to Jekyll post format."""
60+
# Read the guide content
61+
with open(guide_path, 'r', encoding='utf-8') as f:
62+
content = f.read()
63+
64+
# Extract and parse frontmatter
65+
frontmatter_text, body = extract_frontmatter(content)
66+
if not frontmatter_text:
67+
print(f"Warning: No frontmatter found in {guide_path}")
68+
return None
69+
70+
original_data = parse_frontmatter(frontmatter_text)
71+
72+
# Create Jekyll frontmatter
73+
jekyll_frontmatter = create_jekyll_frontmatter(original_data)
74+
75+
# Combine frontmatter and body
76+
jekyll_content = jekyll_frontmatter + '\n' + body
77+
78+
# Generate output filename
79+
created_at = original_data.get('createdAt', datetime.now().strftime('%Y-%m-%d'))
80+
guide_name = Path(guide_path).stem
81+
output_filename = f"{created_at}-{guide_name}.md"
82+
output_path = output_dir / output_filename
83+
84+
# Write the Jekyll post
85+
with open(output_path, 'w', encoding='utf-8') as f:
86+
f.write(jekyll_content)
87+
88+
return output_path
89+
90+
def main():
91+
"""Main conversion function."""
92+
# Define paths
93+
guides_dir = Path('/home/user/system-design-101/data/guides')
94+
output_dir = Path('/home/user/atakuzi.github.io/_posts')
95+
96+
# Get all guide files
97+
guide_files = sorted(guides_dir.glob('*.md'))
98+
99+
print(f"Found {len(guide_files)} guide files to convert")
100+
101+
converted = 0
102+
failed = 0
103+
104+
for guide_file in guide_files:
105+
try:
106+
result = convert_guide_to_post(guide_file, output_dir)
107+
if result:
108+
converted += 1
109+
if converted % 50 == 0:
110+
print(f"Converted {converted} files...")
111+
else:
112+
failed += 1
113+
except Exception as e:
114+
print(f"Error converting {guide_file}: {e}")
115+
failed += 1
116+
117+
print(f"\n✅ Conversion complete!")
118+
print(f" Successfully converted: {converted}")
119+
print(f" Failed: {failed}")
120+
print(f" Output directory: {output_dir}")
121+
122+
if __name__ == '__main__':
123+
main()

0 commit comments

Comments
 (0)