This article was written in 2022 and is republished here.
What is Rich Text?#
Rich Text refers to text that is formatted with common formatting options (such as bold and italic) as opposed to Plain Text.
Web + Rich Text Editor#
It is well known that the Web is currently the most universal platform; whether it's a computer, mobile phone, game console, car, or Kindle, as long as there is a browser, a website can be opened. With the support of technologies and tools like WebAssembly and Electron, super applications (Figma, VS Code) that perform complex operations on the Web are also emerging one after another. Faced with the temptation of "write once, run anywhere," many traditional desktop applications are gradually expanding their territory to the Web, and rich text editors are one of them.
Rich Text Editor = Pitfall?#
In the web front-end industry, rich text editors are recognized as a notorious pitfall. The advantage of web development is that it is cross-platform, but the problems also come from the compatibility issues that cross-platform support brings. Cross-platform means the same code can run on many platforms, but how it runs on each of them is where the pitfalls lie.
In web development, editors must first handle focus, cursor and selection, the undo stack, parsing content pasted from external sources, and so on, and then consider compatibility with different browsers (Chrome, Firefox, Safari, etc.)... After the most basic English input is handled, users who type through an input method, with Chinese users as the typical example, demand support for IME composition input... After IME input is solved, users of RTL languages (Hebrew, Arabic) come... Mobile users come... Collaborative editing users come...
To summarize with a quote from Zhihu:
The contradiction between backward productivity and people's growing demands.
Backward Productivity:
- Slow advancement of web-related standards
- Different implementations of the same operations or scenarios by browser vendors lead to compatibility issues
- Using HTML DOM to describe rich text content has too many uncontrollable situations
Growing Demands:
- Uncertain interaction intentions, such as pressing the Delete key, where different focus positions require different considerations
- Diversity of content input, such as typing, pasting, dragging, etc., each of which is quite complex to handle
- A large number of browser default behaviors that need to be intercepted, blocked, and proxied to ensure data integrity and correctness
- Users have increasingly high requirements for editors, such as: merging cells, multi-level nested lists, collaborative editing, version comparison, paragraph annotations; everyone considers these basic needs, but the technical difficulties involved exceed everyone's imagination.
Overview of Editor Technology Stages#
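Stage | Description | Typical Products |
---|---|---|
L0 | Relies on the browser's native contenteditable for editing and on document.execCommand for operations | |
L1 | Relies on contenteditable for editing, but implements operations itself on top of an abstract data model | Quill, ProseMirror, Draft.js, Slate, Lexical |
L2 | Abandons contenteditable and implements cursor, selection, and layout itself | Google Docs, Tencent Docs, WPS |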
L0#
Editors in the L0 stage mainly rely on the browser's native contenteditable API for editing and use the document.execCommand API to perform various operations, such as bolding, linking, copying, pasting, etc.
Advantages:
- Low technical threshold. As long as the above two APIs are used, the webpage can have editing capabilities.
- Based on the browser's native editing capabilities, input is very smooth.
- No troublesome IME composition input issues.
Disadvantages:
- The same operation may have different implementations across different browsers.
- Outputting rich text content as HTML is not conducive to data management.
- Extending complex rich text is very difficult.
- There is no way to implement collaborative editing.
L1#
Currently, most editor frameworks are in the L1 stage, with representative examples being Quill, Slate, ProseMirror, and Draft.js. They mainly have two obvious characteristics:
- Still rely on the contenteditable API for content editing, but no longer rely on the document.execCommand API for content operations, implementing these operations themselves.
- Have an abstract data model to describe the content and state of the rich text editor.
2012 - Quill#
Quill is an API-driven rich text editor framework that provides an out-of-the-box editing experience.
The author of Quill, Jason Chen, is of Chinese descent. Quill is actually a side project for Jason, who initially founded a company dedicated to creating collaborative editors similar to Google Docs, which led him to write Quill for personal use.
Quill abstracts the operations of modifying the DOM Tree and data, so that in actual use, we do not need to manipulate the DOM directly but instead operate through Quill's API, with the corresponding relationships as follows:
Editor Document ====> Parchment
DOM Node ====> Blot
With this layer of abstraction, the original direct operations on the DOM become operations on Blot, and these operations are represented using Delta. Quill discards the hierarchical structure of the DOM node tree in Delta, so the tags and node relationships that wrap the text are no longer visible at all; only a flattened ops array remains.
Data Model:
{
"ops": [
{
"attributes": {
"bold": true
},
"insert": "Check"
},
{
"insert": " "
},
{
"attributes": {
"link": "https://donaldxdonald.xyz/"
},
"insert": "this"
},
{
"insert": " out ~"
}
]
}
The flattened structure of Delta is designed around the OT (Operational Transformation) model used in collaborative editing, so Quill was built for collaborative editing from the beginning. The benefit of flattening is that it helps performance, while the downside is that complex nested content can be hard to represent.
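To make the flat ops model more concrete, here is a minimal sketch of applying a change to a Delta-style document. The types and the helpers `toPlainText` and `applyChange` are hypothetical names written for this illustration; they are not the API of Quill or the quill-delta package, and attribute changes on retained text are deliberately left out.

```typescript
// Hypothetical, simplified Delta-style types (not the quill-delta API).
type Attributes = Record<string, unknown>;
type InsertOp = { insert: string; attributes?: Attributes };
type RetainOp = { retain: number };
type DeleteOp = { delete: number };
type ChangeOp = InsertOp | RetainOp | DeleteOp;

// A document is a flat list of inserts, so its text is a simple join.
function toPlainText(doc: InsertOp[]): string {
  return doc.map((op) => op.insert).join("");
}

// Apply a change (retain / insert / delete) to a document.
// The input document is not mutated; a new op list is returned.
function applyChange(doc: InsertOp[], change: ChangeOp[]): InsertOp[] {
  const result: InsertOp[] = [];
  // Shallow copies of the document ops that have not been consumed yet.
  const rest = doc.map((op) => ({ ...op }));

  // Consume n characters from the front of `rest`, keeping or dropping them.
  const take = (n: number, keep: boolean): void => {
    while (n > 0 && rest.length > 0) {
      const head = rest[0];
      const len = Math.min(n, head.insert.length);
      if (keep) result.push({ ...head, insert: head.insert.slice(0, len) });
      if (len === head.insert.length) rest.shift();
      else head.insert = head.insert.slice(len);
      n -= len;
    }
  };

  for (const op of change) {
    if ("insert" in op) result.push(op); // new text goes straight to the output
    else if ("retain" in op) take(op.retain, true); // keep existing text
    else take(op.delete, false); // skip existing text
  }
  // Anything the change did not touch is kept unchanged.
  return result.concat(rest);
}
```

Applied to the document above, the change `[{ retain: 6 }, { delete: 4 }, { insert: "that" }]` keeps "Check ", removes "this", and inserts "that" in its place. Because both the document and the change are flat lists, this is a single linear pass with no tree traversal.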
Quill's characteristics are:
- Relies on the browser's native editing capability contentEditable (L1)
- Introduces a layer of abstract data structure to describe content and behavior
- Good support for collaborative editing
- Output structure can be either a string or Delta (JSON), but Delta as a data model has low readability
2015 - ProseMirror#
Marijn is the author of the CodeMirror editor and the acorn parser, the former of which is already used in the debugging tools built into Chrome and Firefox, while the latter is a dependency of Babel. To generate more income, Marijn started a new project, ProseMirror.
Marijn felt that none of the open-source editors on the market at that time adopted what he considered to be an ideal approach, and many were still designed around old paradigms, relying heavily on contentEditable for their implementation. This limited developers' control over document content, which could easily be modified by users and browsers. ProseMirror itself is still based on contentEditable for its editing functionality, because implementing new selection logic from scratch would be too troublesome, but it keeps control of the document in its own model.
ProseMirror has a schema, so once the schema is defined, ProseMirror can derive the parser for you. The framework defines what properties and methods are needed to introduce a new Node, such as the nodeFromJSON method for building a node from its JSON representation and the toDOM method for defining how to convert structured data to DOM (somewhat similar to JSX). ProseMirror manages the changes from JSON data to DOM at this intermediate layer.
Data Model:
{
"type": "paragraph",
"content": [
{
"type": "text",
"marks": [
{
"type": "strong"
}
],
"text": "Check"
},
{
"type": "text",
"text": " "
},
{
"type": "text",
"marks": [
{
"type": "link",
"attrs": {
"href": "https://donaldxdonald.xyz/",
"title": ""
}
}
],
"text": "this"
},
{
"type": "text",
"text": " out ~"
}
]
}
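The idea of a schema that decides how each node and mark becomes DOM can be sketched as follows. This is a simplified illustration, not ProseMirror's API: `PMNode`, `schema`, and `render` are hypothetical names, and the sketch returns an HTML string so that it runs without a browser, whereas ProseMirror's toDOM returns a DOM output spec.

```typescript
// Hypothetical, simplified node shapes matching the JSON above.
type Mark = { type: string; attrs?: Record<string, string> };
type PMNode = { type: string; text?: string; marks?: Mark[]; content?: PMNode[] };

const escapeHtml = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;");

// A toy "schema": one rendering rule per known node type and mark type.
const schema = {
  nodes: {
    paragraph: (inner: string) => `<p>${inner}</p>`,
  } as Record<string, (inner: string) => string>,
  marks: {
    strong: (inner: string) => `<strong>${inner}</strong>`,
    link: (inner: string, attrs: Record<string, string> = {}) =>
      `<a href="${escapeHtml(attrs.href ?? "")}">${inner}</a>`,
  } as Record<string, (inner: string, attrs?: Record<string, string>) => string>,
};

// Render a node by looking up its rule; types outside the schema are rejected.
function render(node: PMNode): string {
  if (node.type === "text") {
    // Wrap the escaped text in each mark, innermost first.
    return (node.marks ?? []).reduce((html, mark) => {
      const wrap = schema.marks[mark.type];
      if (!wrap) throw new Error(`Unknown mark type: ${mark.type}`);
      return wrap(html, mark.attrs);
    }, escapeHtml(node.text ?? ""));
  }
  const wrap = schema.nodes[node.type];
  if (!wrap) throw new Error(`Unknown node type: ${node.type}`);
  return wrap((node.content ?? []).map(render).join(""));
}
```

Because every type must be declared in the schema, content that the schema does not know about cannot silently enter the document, which is the kind of control over document content described above.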
ProseMirror's characteristics are:
- Relies on the browser's native editing capability contentEditable (L1)
- More abstract JSON document model. ProseMirror only defines a configurable model framework, and the specific structure can be customized during actual development.
- Nested tree structure. Can support complex structured content.
- Good support for collaborative editing. From its inception, ProseMirror has focused on supporting collaborative editing.
- After version 1.0, it introduced immutable data, providing a complete data flow for the editor's data processing, making it stable and controllable.
2016.02 - Draft.js#
At that time, Facebook (now Meta) open-sourced Draft.js. Since they are from the same company, Draft.js used React to render the UI in the view layer. This was the first case of combining React with an editor, and the popularity of React allowed users to quickly develop based on Draft.js.
Draft.js not only looks like React but also has deep React influences internally, with state management similar to Redux in EditorState and ContentState, and using features like Immutable at the data layer. The properties of JS objects can be assigned freely, meaning they are mutable. In contrast, immutable data types do not allow arbitrary assignments; each modification through the Immutable API generates a new reference.
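The difference between mutable and immutable updates can be illustrated with a small sketch. It uses plain frozen objects rather than Immutable.js, and `Block`, `Content`, and `setBlockText` are hypothetical names chosen for this example, not part of Draft.js.

```typescript
// Hypothetical content shape: a list of text blocks, all read-only.
type Block = Readonly<{ key: string; text: string }>;
type Content = Readonly<{ blocks: readonly Block[] }>;

// Return a NEW content object with one block's text replaced.
// The old content is left untouched, and unchanged blocks are shared.
function setBlockText(content: Content, key: string, text: string): Content {
  return Object.freeze({
    blocks: Object.freeze(
      content.blocks.map((block) =>
        block.key === key ? Object.freeze({ ...block, text }) : block
      )
    ),
  });
}
```

Since every edit yields a new reference, checking whether anything changed is a cheap `before !== after` comparison, and keeping old references around gives an undo history almost for free.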
Draft.js's characteristics are:
- Relies on the browser's native editing capability contentEditable (L1)
- Uses React to implement the view layer
- Separates content storage and rendering logic
- Uses immutable data
- Although it also abstracts a JSON-based data model, its support for nested data is somewhat weak
2016.06 - Slate#
At this time, many editor frameworks were already in circulation, but Ian Storm Taylor, while developing his own CMS product, still felt that there was no good editor available. He believed that these editors were fine for simple products, but developing large applications like Medium, Google Docs, and Dropbox Paper was too difficult, leading to the creation of Slate.
Slate is also an editor framework rather than an out-of-the-box editor tool. As a younger framework, Slate incorporates many advantages from its predecessors, referencing immutable data and plugin mechanisms from Draft.js, and borrowing nested data structures and schema constraints from ProseMirror. By integrating many core features of editor frameworks, along with an advanced framework concept and the author's pursuit of architecture (as of 2022, it is still in beta and has not reached 1.0), Slate remains quite popular in the community.
Data Model:
{
"object": "block",
"type": "paragraph",
"nodes": [
{
"object": "text",
"text": "This is editable "
},
{
"object": "text",
"text": "rich",
"marks": [{ "type": "bold" }]
},
{
"object": "text",
"text": " text, "
},
{
"object": "text",
"text": "much",
"marks": [{ "type": "italic" }]
},
{
"object": "text",
"text": " better than a "
},
{
"object": "text",
"text": "<textarea>",
"marks": [{ "type": "code" }]
},
{
"object": "text",
"text": "!"
}
]
}
At this point, Slate's characteristics are:
- Relies on the browser's native editing capability contentEditable (L1)
- Uses React to implement the view layer
- Supports nested JSON data structures
- Immutable data
- Plugin mechanism as the core
- Schema for constraining data
2019 - Slate 0.50+#
Slate underwent a major update in its architecture, with the author stating, "The entire framework was reconsidered from the ground up." The main updates include:
- Separating the underlying logic into Slate Core, decoupling it from the view layer
- Rewriting in TypeScript
- Simplifying the plugin mechanism, with plugins no longer coupled with rendering logic
- Replacing Immutable.js with simple JSON objects
- Streamlining its own concepts and some commands, renaming them to Transforms
Data Model:
{
"type": "paragraph",
"children": [
{ "text": "This is editable " },
{ "text": "rich", "bold": true },
{ "text": " text, " },
{ "text": "much", "italic": true },
{ "text": " better than a " },
{ "text": "<textarea>", "code": true },
{ "text": "!" }
]
}
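Comparing the two data models side by side, the change from wrapped objects to plain JSON can be expressed as a small pure function. The following is only an illustrative sketch: the type names and `convertNode` are hypothetical and are not a migration tool shipped with Slate.

```typescript
// Hypothetical types for the older, more verbose model shown earlier.
type OldMark = { type: string };
type OldText = { object: "text"; text: string; marks?: OldMark[] };
type OldBlock = { object: "block"; type: string; nodes: Array<OldBlock | OldText> };

// Hypothetical types for the plain-JSON model shown above.
type NewText = { text: string; [mark: string]: string | boolean };
type NewElement = { type: string; children: Array<NewElement | NewText> };

// Pure function: builds a new tree and never touches the input.
function convertNode(node: OldBlock | OldText): NewElement | NewText {
  if (node.object === "text") {
    const leaf: NewText = { text: node.text };
    // Each mark becomes a boolean property directly on the text leaf.
    for (const mark of node.marks ?? []) leaf[mark.type] = true;
    return leaf;
  }
  // Blocks keep their type; "nodes" becomes "children".
  return { type: node.type, children: node.nodes.map(convertNode) };
}
```

The result contains only plain objects, arrays, strings, and booleans, so it can be stored or sent over the network with `JSON.stringify` without any extra serialization step.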
At this point, Slate's characteristics are:
- Relies on the browser's native editing capability contentEditable (L1)
- Very concise support for nested data models
- Overall architecture adopts pure functions + interfaces, making both the thought process and code very simple
- The plugin mechanism supports the development of powerful features
- The overall design philosophy is very similar to the DOM
2022 - Lexical#
Draft.js did a lot of dirty work for browser compatibility in the past, much of which is now unnecessary, and many developers complained that it was not user-friendly. To shed that burden and improve the developer experience, Meta open-sourced a new editor framework called Lexical, aimed at replacing Draft.js.
So far, Lexical does not seem to introduce any particularly innovative concepts; it mainly absorbs the advantages of other editor frameworks:
- Relies on the browser's native editing capability contentEditable (L1)
- Retains some concepts from Draft.js (EditorState)
- Not bound to React, allowing various frameworks to implement the view layer
- The entire framework is quite lightweight, with almost no other dependencies
L2#
As mentioned earlier, with the browser's contenteditable API, it is possible to quickly develop an editor, but there are too many compatibility issues. Although most editor frameworks are still in L1, as early as 2010, the financially strong Google had already begun to abandon the browser's contenteditable. The new version of Google Docs is based on Canvas, with its own cursor system, text layout system, and font parsing. In addition to its own abstract data structure, even the editing operations are implemented by Google itself, which gives consistent editing and rendering results and makes collaborative editing a natural outcome.
Of course, the core technology of Google Docs is not open-sourced; work that requires this level of investment stays firmly in Google's own hands. Editors from China such as Tencent Docs and WPS can also be considered L2: they do not necessarily use Canvas, but the idea is the same, implementing editing and layout independently and abandoning the browser's contenteditable.
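To give a sense of what "implementing layout independently" involves, here is a minimal sketch of one small piece of it: greedy line breaking. It is an assumption-laden illustration, not how Google Docs or any other product actually works; `breakLines` and `Measure` are hypothetical names. In a browser, the measure function could be backed by the canvas `measureText` method.

```typescript
// A function that reports the rendered width of a string (in any unit).
type Measure = (s: string) => number;

// Greedy line breaking: put as many words on a line as fit in maxWidth.
// A single word wider than maxWidth is kept on its own line, unbroken.
function breakLines(text: string, maxWidth: number, measure: Measure): string[] {
  const lines: string[] = [];
  let line = "";
  for (const word of text.split(" ")) {
    const candidate = line === "" ? word : `${line} ${word}`;
    if (line !== "" && measure(candidate) > maxWidth) {
      lines.push(line); // current line is full, start a new one
      line = word;
    } else {
      line = candidate;
    }
  }
  if (line !== "") lines.push(line);
  return lines;
}
```

A real editor has to go much further (cursor positioning, selection rectangles, bidirectional text, font fallback), which is why this stage is only realistic for teams with substantial resources.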
Conclusion#
Contenteditable is terrible, but editors have already minimized their use of it. More severe is the chaotic ecosystem formed by the combination of operating systems, browsers, and input methods: an editor cannot control any of it, yet products are expected to flourish in this ecosystem. Hence, it is said that web rich text editors are one of the pitfalls in front-end development.
Reference Links#
- Why is it said that rich text editors are pitfalls? - Zhihu
- What proportion of front-end engineers can independently develop a text editor suitable for commercial websites within a reasonable time? - Zhihu
- The evolution of open-source rich text editor technology (2020 1024)
- The dilemma and breakthrough of ContentEditable
- Slate.js - A revolutionary rich text editing framework - Juejin
- Jason Chen: Building Editors in the Browser | JSConf.ar 2014
- Marijn Haverbeke: Salvaging contentEditable: Building a Robust WYSIWYG Editor | JSConf EU 2015
- ProseMirror
- Facebook open sources Lexical, an extensible text editor library - Hacker News
- Why is ContentEditable so terrible - OSCHINA - Chinese Open Source Technology Exchange Community
- How do you view Google Docs migrating from HTML to Canvas-based rendering? - Zhihu
- Discussing the dilemmas of web rich text editors based on popular editor architectures