Commit ae3fb89

Merge pull request #785 from CDLUC3/bug/JS-fix-issue-with-docx-download-fonts
Switched to using pandoc to address issue with font sizes in downloading .docx files
2 parents b88a372 + 0030b38 commit ae3fb89

7 files changed, with 157 additions and 11 deletions

Dockerfile

Lines changed: 2 additions & 1 deletion
@@ -40,7 +40,8 @@ RUN apt-get -qqy update \
 shared-mime-info \
 nodejs -qqy \
 chromium \
-&& rm -rf /var/lib/apt/lists/*
+pandoc \
+&& rm -rf /var/lib/apt/lists/*

 # Always run Node in Production for the ECS hosted environments
 ENV NODE_ENV=production

Dockerfile.dev

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ RUN set -ex \
 libssl-dev libstdc++6 libtool libx11-6 libx11-xcb1 libxcb1 libxcomposite1 \
 libxcursor1 libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 \
 libxrender1 libxss1 libxtst6 libyaml-dev locales lsb-release mtr \
+pandoc \
 shared-mime-info tzdata vim wget xdg-utils xfonts-base \
 xfonts-75dpi xz-utils yarn \
 python3 make \

app/controllers/plan_exports_controller.rb

Lines changed: 34 additions & 4 deletions
@@ -117,10 +117,40 @@ def show_text
   end

   def show_docx
-    # Using and optional locals_assign export_format
-    render docx: "#{file_name}.docx",
-           content: clean_html_for_docx_creation(render_to_string(partial: 'shared/export/plan',
-                                                                  locals: { export_format: 'docx' }))
+    # Use Pandoc for better HTML->DOCX conversion (handles CSS font-size correctly)
+    begin
+      html = render_to_string(partial: 'shared/export/plan', locals: { export_format: 'docx' })
+
+      docx_path = Rails.root.join('tmp', "#{file_name}.docx")
+      html_path = Rails.root.join('tmp', "#{file_name}.html")
+
+      # Write HTML to temp file for Pandoc
+      File.write(html_path, html)
+
+      # Convert using Pandoc with reference document for styling
+      reference_doc = Rails.root.join('lib', 'templates', 'pandoc_reference.docx')
+      result = system('pandoc', '-f', 'html', '-t', 'docx',
+                      "--reference-doc=#{reference_doc}",
+                      '-o', docx_path.to_s, html_path.to_s)
+
+      if result && File.exist?(docx_path)
+        send_data File.read(docx_path, mode: 'rb'),
+                  filename: "#{file_name}.docx",
+                  type: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
+      else
+        raise "Pandoc conversion failed"
+      end
+    rescue StandardError => e
+      Rails.logger.error("Unable to generate DOCX with Pandoc: #{e.message}")
+      # Fallback to htmltoword method if pandoc fails for any reason
+      render docx: "#{file_name}.docx",
+             content: clean_html_for_docx_creation(render_to_string(partial: 'shared/export/plan',
+                                                                    locals: { export_format: 'docx' }))
+    ensure
+      # Cleanup temp files
+      File.delete(html_path) if File.exist?(html_path)
+      File.delete(docx_path) if File.exist?(docx_path)
+    end
   end

   def show_pdf
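
Review note on `show_docx`: the temp files live at fixed paths under `tmp/` derived from `file_name`, so two simultaneous exports of the same plan could overwrite each other's files. Also, if `render_to_string` raises, `html_path` is still `nil` when the `ensure` block runs, and `File.exist?(nil)` raises a `TypeError`. The sketch below is one possible way to isolate each conversion in its own scratch directory and to capture Pandoc's error output. It is not part of the commit: the helper name `html_to_docx` and its `pandoc:` keyword are hypothetical, and it assumes the same `pandoc` flags the diff already uses.

```ruby
require 'open3'
require 'tmpdir'

# Hypothetical helper (not in the commit): converts an HTML string to DOCX bytes.
# Each call works in its own scratch directory, which Dir.mktmpdir removes
# when the block exits, whether or not the conversion succeeded.
def html_to_docx(html, reference_doc: nil, pandoc: 'pandoc')
  Dir.mktmpdir('plan_export') do |dir|
    html_path = File.join(dir, 'plan.html')
    docx_path = File.join(dir, 'plan.docx')
    File.write(html_path, html)

    cmd = [pandoc, '-f', 'html', '-t', 'docx', '-o', docx_path]
    cmd << "--reference-doc=#{reference_doc}" if reference_doc
    cmd << html_path

    # capture2e merges stdout and stderr so the failure reason can be logged.
    output, status = Open3.capture2e(*cmd)
    unless status.success? && File.exist?(docx_path)
      raise "Pandoc conversion failed: #{output.strip}"
    end

    File.binread(docx_path)
  end
end
```

A controller could pass the returned bytes to `send_data` and keep the existing `rescue StandardError` fallback. A missing `pandoc` binary surfaces as `Errno::ENOENT`, which that rescue would also catch.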

app/views/branded/shared/export/_plan.erb

Lines changed: 6 additions & 2 deletions
@@ -22,7 +22,10 @@
 <% if @hash[:all_phases] || (@selected_phase.present? && phase[:title] == @selected_phase.title) %>
 <%# Page break before each phase %>
 <div style="page-break-before:always;"></div>
-<h1><%= download_plan_page_title(@plan, phase, @hash) %></h1>
+<%# Skip H1 for DOCX - Pandoc converts HTML <title> to Word Title style, making H1 redundant %>
+<% unless local_assigns[:export_format] == 'docx' %>
+<h1><%= download_plan_page_title(@plan, phase, @hash) %></h1>
+<% end %>
 <% phase[:sections].each do |section| %>
 <% if display_section?(@hash[:customization], section, @show_custom_sections) && num_section_questions(@plan, section, phase) > 0 %>
 <% if @show_sections %>
@@ -42,7 +45,8 @@

 <% if @show_questions %>
 <% if (@show_unanswered && blank) || !blank %>
-<h3><%= sanitize question[:text].to_s, scrubber: TableFreeScrubber.new %></h3>
+<%# Strip <p> tags to prevent invalid HTML structure (<h3><p>...</p></h3>) %>
+<h3><%= sanitize(question[:text].to_s, scrubber: TableFreeScrubber.new).gsub(/<\/?p>/, '').html_safe %></h3>
 <% end %>
 <% end %>

app/views/shared/export/_plan.erb

Lines changed: 8 additions & 4 deletions
@@ -22,8 +22,11 @@
 <% if @hash[:all_phases] || (@selected_phase.present? && phase[:title] == @selected_phase.title) %>
 <%# Page break before each phase %>
 <div style="page-break-before:always;"></div>
-<h1><%= download_plan_page_title(@plan, phase, @hash) %></h1>
-<hr />
+<%# Skip H1 for DOCX - Pandoc converts HTML <title> to Word Title style, making H1 redundant %>
+<% unless local_assigns[:export_format] == 'docx' %>
+<h1><%= download_plan_page_title(@plan, phase, @hash) %></h1>
+<hr />
+<% end %>
 <% phase[:sections].each do |section| %>
 <% if display_section?(@hash[:customization], section, @show_custom_sections) && num_section_questions(@plan, section, phase) > 0 %>
 <% if @show_sections %>
@@ -41,11 +44,12 @@
 <% blank = answer.present? ? answer.blank? : true %>
 <% options = answer.present? ? answer.question_options : [] %>
 <% if @show_unanswered %>
+<%# Strip <p> tags to prevent invalid HTML (<strong><p>...</p></strong> or <div><p>...</p></div>) %>
 <%# Hack: for DOCX export - otherwise, bold highlighting of question inconsistent. %>
 <% if local_assigns[:export_format] && export_format == 'docx' %>
-<strong><%= sanitize question[:text].to_s, scrubber: TableFreeScrubber.new %></strong>
+<strong><%= sanitize(question[:text].to_s, scrubber: TableFreeScrubber.new).gsub(/<\/?p>/, '').html_safe %></strong>
 <% else %>
-<div class="bold"><%= sanitize question[:text].to_s, scrubber: TableFreeScrubber.new %></div>
+<div class="bold"><%= sanitize(question[:text].to_s, scrubber: TableFreeScrubber.new).gsub(/<\/?p>/, '').html_safe %></div>
 <% end %>
 <br>
 <% end %>
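
Review note on the `gsub(/<\/?p>/, '')` calls in both partials: the pattern removes only bare `<p>` and `</p>` tags. The small sketch below shows what it does and does not match, using plain Ruby strings in place of the sanitized Rails output. The helper name `strip_p_tags` is hypothetical, and whether the scrubber can ever emit a `<p>` with attributes is an assumption that has not been checked against `TableFreeScrubber`.

```ruby
# Same pattern as in the templates: matches "<p>" and "</p>" only.
P_TAG = /<\/?p>/

# Hypothetical helper wrapping the gsub call used in the views.
def strip_p_tags(html)
  html.gsub(P_TAG, '')
end

puts strip_p_tags('<p>What data will you collect?</p>')
# What data will you collect?

# An opening tag with attributes is left in place; only the closing tag goes.
puts strip_p_tags('<p class="lead">Hello</p>')
# <p class="lead">Hello

# Adjacent paragraphs are joined with no separator.
puts strip_p_tags('<p>First.</p><p>Second.</p>')
# First.Second.
```

If question text can contain several paragraphs, replacing `</p><p>` with a space before stripping may read better in the exported heading.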

lib/templates/pandoc.md

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
+# Pandoc DOCX Reference Document
+
+The file `pandoc_reference.docx` is used as a style template when generating `.docx` exports. Pandoc uses the styles defined in this file while filling in content from the app. It does **not** use the content of the reference doc, only its styles.
+
+## How it works
+
+When `show_docx` runs in `plan_exports_controller.rb`, it passes this file to `pandoc` via `--reference-doc`. Pandoc maps HTML elements to Word styles like so:
+
+| HTML element | Word style |
+|---|---|
+| `<p>` | Body Text / First Paragraph |
+| `<h1>`–`<h6>` | Heading 1 – Heading 6 |
+| `<blockquote>` | Block Text |
+| `<code>` | Verbatim Char |
+| `<a>` | Hyperlink |
+| `<hr>` | Horizontal rule paragraph (avoid — use pBdr on heading instead) |
+
+## Editing styles
+
+The styles are stored in `word/styles.xml` inside the `.docx` file. Since `.docx` files are zip archives, you need to unzip, edit, and rezip.
+
+### 1. Unzip
+
+```bash
+cd lib/templates
+unzip pandoc_reference.docx -d pandoc_ref_tmp
+```
+
+### 2. Edit
+
+Open `pandoc_ref_tmp/word/styles.xml` in any text editor and find the style you want to change.
+
+Font sizes use half-points, so:
+
+| Value | Point size |
+|---|---|
+| 20 | 10pt |
+| 22 | 11pt |
+| 24 | 12pt |
+| 26 | 13pt |
+| 28 | 14pt |
+| 32 | 16pt |
+
+Spacing values (e.g. `w:before`, `w:after`) use twentieths of a point (twips), so 240 = 12pt of space.
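
Review note: the two unit rules above (half-points for `w:sz`, twips for spacing) are easy to mix up when editing `styles.xml` by hand. A minimal sketch of the arithmetic follows. The helper names are hypothetical and exist only for illustration.

```ruby
# Font sizes (w:sz) are stored in half-points: value = points * 2.
def points_to_half_points(points)
  (points * 2).round
end

# Spacing (w:before, w:after) is stored in twips: value = points * 20.
def points_to_twips(points)
  (points * 20).round
end

puts points_to_half_points(11)   # 22
puts points_to_half_points(10.5) # 21
puts points_to_twips(12)         # 240
```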
+
+Common styles to edit:
+
+- **`docDefaults`** — fallback font and size for anything not explicitly styled
+- **`BodyText`** — regular paragraphs
+- **`FirstParagraph`** — first paragraph after a heading
+- **`Heading1`–`Heading9`** — headings (also update the matching `Heading1Char` etc.)
+
+### 3. Rezip
+
+Run the zip command from **inside** the tmp folder — this is important, otherwise the folder itself gets included and Pandoc won't read the file correctly.
+
+```bash
+cd pandoc_ref_tmp
+zip -r ../pandoc_reference.docx .
+cd ..
+rm -rf pandoc_ref_tmp
+```
+
+### 4. Deploy
+
+Commit `pandoc_reference.docx` and deploy. Changes take effect immediately on the next export.
+
+## Editing styles using Word
+
+You can open `pandoc_reference.docx` directly in Word and modify styles visually, though it requires a few extra steps to make sure changes actually persist to the style definitions (not just the sample text).
+
+### 1. Open the file in Word
+
+Open `pandoc_reference.docx` directly. You will see sample text for each style, e.g. "Body Text.", "First Paragraph.", "Heading 1", etc.
+
+### 2. Open the Styles pane
+
+- **Mac:** Go to **Format** menu → **Style...** → **Modify**
+- **Windows:** On the **Home** tab, click the small arrow at the bottom-right corner of the Styles group
+
+### 3. Modify a style
+
+In the Styles pane, hover over the style you want to change (e.g. **Body Text**) until a dropdown arrow appears on the right side. Click it and choose **Modify Style...**. Change the font, size, spacing, etc. and click **OK**.
+
+Do **not** right-click the sample text in the document body itself — that only affects the text, not the underlying style definition.
+
+### 4. Save
+
+Save the file normally. Word should write your style changes into the underlying XML.
+
+### 5. Verify the change was saved
+
+Because Word sometimes silently fails to persist style changes, it is worth verifying by unzipping and inspecting the XML after saving:
+
+```bash
+cd lib/templates
+unzip -p pandoc_reference.docx word/styles.xml | grep -A 10 '"Body Text"'
+```
+
+Check that `<w:sz w:val="..."/>` reflects the size you set. If it does not, the change did not save and you will need to edit the XML directly instead (see **Editing styles** above).
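
Review note: the manual `grep` check above could also be scripted. The sketch below pulls the `w:sz` value for one style out of a `styles.xml` string and converts it to points. It is a rough illustration, not part of the commit: the helper name `style_font_size` is hypothetical, the sample XML is made up, and the regular expressions assume the `w:styleId` attribute sits on the opening `<w:style>` tag. It is not a full XML parse.

```ruby
# Hypothetical helper: returns the font size in points for a style ID,
# or nil if the style or its <w:sz> element is not found.
def style_font_size(styles_xml, style_id)
  pattern = /<w:style\b[^>]*w:styleId="#{Regexp.escape(style_id)}"[^>]*>.*?<\/w:style>/m
  block = styles_xml[pattern]
  return nil unless block

  # "\s+" after "w:sz" keeps the complex-script element <w:szCs> from matching.
  half_points = block[/<w:sz\s+w:val="(\d+)"/, 1]
  half_points && half_points.to_i / 2.0
end

# Made-up sample standing in for the real word/styles.xml.
sample = <<~XML
  <w:styles>
    <w:style w:type="paragraph" w:styleId="BodyText">
      <w:name w:val="Body Text"/>
      <w:rPr><w:szCs w:val="30"/><w:sz w:val="22"/></w:rPr>
    </w:style>
  </w:styles>
XML

puts style_font_size(sample, 'BodyText') # 11.0
```

To run it against the real template, the XML string could come from the output of the same `unzip -p pandoc_reference.docx word/styles.xml` command shown above.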
+
+## Tips
+
+- Always back up the file before editing: `cp pandoc_reference.docx pandoc_reference_backup.docx`
+- To verify a style was saved correctly, unzip and grep: `unzip -p pandoc_reference.docx word/styles.xml | grep -A 10 '"Body Text"'`
+- Avoid using `<hr>` in the HTML template. Instead, add a bottom border to the Heading style using `<w:pBdr>` in `styles.xml` to avoid extra spacing.
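
Binary file not shown (15.9 KB). This is presumably `lib/templates/pandoc_reference.docx`, the reference document that the controller and `pandoc.md` refer to.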
