Multi Author Docs
Summary
We all know that TeX has large use cases, such as research math and physics. This post is about how TeX might help with some small niche use cases. We study one example, and suggest how it might scale to others. The scaling depends on non-TeX software and systems, such as web frameworks and server or cloud computing.
Introduction
We're social creatures. Mostly, we live in communities. We speak and listen. We use language. Fortunately, most of us read and write. Multiple author documents have a special place in our culture and history: for example, religious texts, laws and reference works, and also standards and source code for computer software.
We're also tool makers and users, with imagination. This post is about making it easier for a group to come together to produce a multiple author document. It's based on a discussion I had with Gernot Salzer at a recent TeX Office Hour, on how he and his colleagues at TU Wien produce curricula for about 100 academic programs.
If you're interested do drop in to my TeX Office Hour (every Thursday evening 6:30 to 7:30pm UK time).
The TU Wien example
TU Wien (see Links below) is one of the major universities in Austria. It's a technical university in Vienna. It has about 28,000 students and 5,000 staff, of whom 3,800 are academics. It focuses on engineering, computer science and natural sciences. Its motto Technik für Menschen (Technology for People) could also be the motto for this article.
It offers about 100 academic programs. I'll review the curriculum for one of them (see Links below). You might like to take a look at it now, or after you've read this post.
This curriculum has 91 pages. Its typography is quite simple - sections, paragraphs, lists, and a few tables. There are also some footnotes and semantic changes of font. It's been typeset with LaTeX (as suggested by its fonts and layout).
TU Wien has about 100 similar documents, which are revised from time to time. Each has multiple authors, stakeholders and constraints, including:
- The state of Austria, via laws and legal regulations.
- The TU Wien Senate, via rules and regulations.
- Professors describing modules and courses.
- Curriculum managers providing additional information.
For example, the senate must respect Austrian law, which requires curricula to use an appropriate choice of words. To me it seems likely that hundreds if not thousands of the 5,000 TU Wien staff contribute fairly directly to the authoring of at least one curriculum.
Each author of a curriculum wants to be sure that the parts of the curriculum they're responsible for are accurate, and haven't been changed during the editorial and production process.
The prospective student reading the curriculum has a different experience. For them, the author of the document is TU Wien. Further, it's a legal document. It describes the benefits provided to the student, as a result of enrolling in the program.
For both authors and readers, consistency is a great help.
The generic example
Based on the above, and my experience of the Django web framework, I suggest creating a framework. Here's my suggestion for the use case.
Design goal
An off-the-shelf open source software system, suitably configured, will provide a satisfactory solution. This is the guiding principle for the generic example.
Configuration might require some customisation of software components, such as:
- Basic parameters
- Design parameters
- HTML templates
- CSS style sheets
- XSLT stylesheets
- TeX fonts and macros
Only necessary knowledge should be required to configure and customise the system. That is, knowledge of:
- The specific use case
- The high level structure of the framework
- The configuration component being changed.
For example, no knowledge of HTML should be required to change the TeX macros, or vice versa.
People
The people constraint is based on the design goal just stated. The group should be large enough to require a custom solution and small enough to have a simple solution. The cost in time, effort and money should be low.
Here's my guess. Yours might be different.
- 1 organisation
- 50 to 1,000 authors
- Easy to learn and use for most authors.
- Web interface available.
- 20 to 500 similar documents
- Permissions governed by roles
- 1 to 5 roles for each document
- Skills available sufficient for required customisation.
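As a minimal sketch of what "permissions governed by roles" might look like, here is some Python. The role names, section names, author names and the `can_edit` helper are all hypothetical. They are not taken from TU Wien or from any existing framework.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Document:
    """One curriculum-like document, split into named sections."""
    title: str
    # Hypothetical mapping: section name -> role allowed to edit it.
    section_roles: Dict[str, str] = field(default_factory=dict)
    # Hypothetical mapping: author name -> roles held for this document.
    author_roles: Dict[str, Set[str]] = field(default_factory=dict)


def can_edit(doc: Document, author: str, section: str) -> bool:
    """Return True if the author holds the role that owns the section."""
    required = doc.section_roles.get(section)
    if required is None:
        return False  # Unknown section: refuse by default.
    return required in doc.author_roles.get(author, set())


# Made-up example data, for illustration only.
doc = Document(
    title="Example curriculum",
    section_roles={"legal-basis": "senate", "module-descriptions": "professor"},
    author_roles={"alice": {"professor"}, "bob": {"senate", "manager"}},
)
```

With a small number of roles per document, a check like this gives each author some assurance that only people with the right role can change their part.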
Framework
This is a reformulation of the design goal.
- Provides basic facilities
- Off-the-shelf solutions for standard problems.
- Customisation by addition of components.
- Framework and components easy to configure.
- Clearly stated special skills, such as LaTeX, XML, XSLT, HTML, CSS (and of course Information Architecture).
- Separation of concerns (e.g. a LaTeX expert doesn't need to know CSS).
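One possible reading of "customisation by addition of components" is a registry that components add themselves to. The sketch below is only an illustration of that pattern. The names `RENDERERS`, `register` and `render`, and the example record, are my own assumptions and not part of any existing system.

```python
from typing import Callable, Dict

# Hypothetical registry: component name -> function that renders a record.
RENDERERS: Dict[str, Callable[[dict], str]] = {}


def register(name: str):
    """Decorator that adds a renderer to the registry under the given name."""
    def decorator(func):
        RENDERERS[name] = func
        return func
    return decorator


@register("plain")
def render_plain(record: dict) -> str:
    # A component written by someone who knows nothing about LaTeX.
    return f"{record['title']} ({record['ects']} ECTS)"


@register("latex")
def render_latex(record: dict) -> str:
    # A component written by someone who knows nothing about the others.
    return r"\section{" + record["title"] + "}"


def render(name: str, record: dict) -> str:
    """Look up a component by name; fail clearly if it is not installed."""
    try:
        renderer = RENDERERS[name]
    except KeyError:
        raise ValueError(f"no component registered for {name!r}") from None
    return renderer(record)
```

Each component lives in its own function, so the author of one need not read or understand the others.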
Deployment
The basic idea is that deployment should be easy, and that it should rely only on widely used and understood resources.
- Store source documents in git, locally or on the cloud.
- Standard routes provided for cloud hosting.
- Compatible with running on local servers.
- Compatible with running on user PCs.
- Compatible with serverless computing (see Links below).
Outputs
The outputs are as required by the organisation. Providing additional outputs should be straightforward. The core outputs should include:
- HTML
- XML
- JSON
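Here is a minimal sketch, in Python, of how a single structured record might be turned into the three core outputs. The `module` record, its field names and the three helper functions are invented for illustration. Only the standard library is used.

```python
import json
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical structured source for one module of a curriculum.
module = {
    "code": "M101",
    "title": "Algebra & Geometry",
    "ects": 6,
}


def to_json(record: dict) -> str:
    """Serialise the record as JSON, with keys in a stable order."""
    return json.dumps(record, sort_keys=True)


def to_xml(record: dict) -> str:
    """Serialise the record as XML, one child element per field."""
    root = ET.Element("module")
    for key, value in record.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def to_html(record: dict) -> str:
    """Render the record as an HTML definition list, escaping all text."""
    rows = "".join(
        f"<dt>{escape(key)}</dt><dd>{escape(str(value))}</dd>"
        for key, value in record.items()
    )
    return f"<dl>{rows}</dl>"
```

The point is that all three outputs come from the same source record, so they cannot drift apart.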
Relevant prior art
The basic idea is to exploit structured content. One established example is XML publishing. See the Links below for three examples of how structured content is being converted to PDF.
In a certain sense the basic idea is to build a generic tool that makes it much easier (and cheaper) to solve niche cases. For that, web frameworks such as Django, Flask and Hugo (see Links below) provide examples worth studying.
These examples can provide inspiration, and a pattern to follow. More examples might be needed, as progress is made.
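To make the structured content idea a little more concrete, here is a small Python sketch that turns a structured record into LaTeX source, ready to be typeset to PDF. It is not how any of the linked systems work. The record layout, the `module_to_latex` function and the choice of LaTeX commands are assumptions of mine.

```python
# Characters with special meaning in LaTeX, and one common way to escape them.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def latex_escape(text: str) -> str:
    # Escape one character at a time, so replacements are never re-escaped.
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def module_to_latex(module: dict) -> str:
    """Build LaTeX source for one module: heading, summary, list of courses."""
    lines = [r"\subsection{" + latex_escape(module["title"]) + "}"]
    lines.append(latex_escape(module["summary"]))
    lines.append(r"\begin{itemize}")
    for course in module["courses"]:
        lines.append(r"\item " + latex_escape(course))
    lines.append(r"\end{itemize}")
    return "\n".join(lines)


# Made-up example data, for illustration only.
example = {
    "title": "Algebra & Geometry",
    "summary": "Covers 100% of the core material.",
    "courses": ["Linear Algebra", "Geometry_1"],
}
```

Because authors supply only the content, and the macros are applied in one place, every document comes out with the same structure and appearance.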
Acknowledgement
I'm most grateful to Gernot Salzer for telling me about the production of curricula at TU Wien, and for being generous with his time and providing further information. Any errors in the above are mine.
Links
TeX Office Hour
- Interested? Do drop in to TeX Office Hour (every Thursday evening 6:30 to 7:30pm UK time).
People
TU Wien
Structured content to PDF
- The R Journal - open refereed journal for statistical computing
- ReportLab - content to PDF
- Simaril Consultants - XML, XSLT, LaTeX
Wikipedia links