diff options
| author | Jonathan Corbet <corbet@lwn.net> | 2025-08-29 15:58:48 -0600 |
|---|---|---|
| committer | Jonathan Corbet <corbet@lwn.net> | 2025-08-29 15:58:48 -0600 |
| commit | c67a9f492c3c15ef1695c629bf5cc57f674a8e83 (patch) | |
| tree | 0b09fbef0138730702a0ef2c3b19d5748ca7145c | |
| parent | 61578493ca7f9d4acd804544f3f5651f5124b12f (diff) | |
| parent | aebcc3009ed55564b74378945206642a372c3e27 (diff) | |
Merge branch 'mauro' into docs-mw
Another build series from Mauro:
The goal of this series is to drop one of the most ancient and ugliest
hack from the documentation build system. Before migrating to Sphinx,
the media subsystem already had a very comprehensive uAPI book, together
with a build time system to detect and point for any documentation gaps.
When migrating to Sphinx, we ported the logic to a Perl script
(parse-headers.pl) and Markus came up with a Sphinx extension
(kernel_include.py). We also added some files to control how parse-headers
produce results, and a Makefile.
At the initial Sphinx versions (1.4.1 if I recall correctly), when
a new symbol is added to videodev2.h, a new warning were
produced at documentatiion time, it the patchset didn't have
the corresponding documentation path.
While kernel-include is generic, the only user at the moment is the media
subsystem.
This series gets rid of the Python script, replacing it by a command
line script and a class. The parse header class can optionally be used by
kernel-include to produce an enriched code that will contain cross-references.
As the other conversions, it starts with a bug-compatible version of
parse-headers, but the subsequent patches add more functionalities and
fix bugs.
It should be noticed that modern of Sphinx disabled the cross-reference
warnings. So, at the next series, I'll be re-adding it in a controlled way
(e.g. just for the references from kernel-include that has an special
argument).
The script also supports now generating a "toc" output, which will be used
at the next series.
24 files changed, 1695 insertions, 616 deletions
diff --git a/.pylintrc b/.pylintrc index 30b8ae1659f8..89eaf2100edd 100644 --- a/.pylintrc +++ b/.pylintrc @@ -1,2 +1,2 @@ [MASTER] -init-hook='import sys; sys.path += ["scripts/lib/kdoc", "scripts/lib/abi"]' +init-hook='import sys; sys.path += ["scripts/lib/kdoc", "scripts/lib/abi", "tools/docs/lib"]' diff --git a/Documentation/Makefile b/Documentation/Makefile index 2ed334971acd..5c20c68be89a 100644 --- a/Documentation/Makefile +++ b/Documentation/Makefile @@ -87,7 +87,7 @@ loop_cmd = $(echo-cmd) $(cmd_$(1)) || exit; PYTHONPYCACHEPREFIX ?= $(abspath $(BUILDDIR)/__pycache__) quiet_cmd_sphinx = SPHINX $@ --> file://$(abspath $(BUILDDIR)/$3/$4) - cmd_sphinx = $(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/userspace-api/media $2 && \ + cmd_sphinx = \ PYTHONPYCACHEPREFIX="$(PYTHONPYCACHEPREFIX)" \ BUILDDIR=$(abspath $(BUILDDIR)) SPHINX_CONF=$(abspath $(src)/$5/$(SPHINX_CONF)) \ $(PYTHON3) $(srctree)/scripts/jobserver-exec \ @@ -171,7 +171,6 @@ refcheckdocs: cleandocs: $(Q)rm -rf $(BUILDDIR) - $(Q)$(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/userspace-api/media clean dochelp: @echo ' Linux kernel internal documentation in different formats from ReST:' diff --git a/Documentation/sphinx/kernel_include.py b/Documentation/sphinx/kernel_include.py index 1e566e87ebcd..23566ab74866 100755 --- a/Documentation/sphinx/kernel_include.py +++ b/Documentation/sphinx/kernel_include.py @@ -1,31 +1,82 @@ #!/usr/bin/env python3 -# -*- coding: utf-8; mode: python -*- # SPDX-License-Identifier: GPL-2.0 -# pylint: disable=R0903, C0330, R0914, R0912, E0401 +# pylint: disable=R0903, R0912, R0914, R0915, C0209,W0707 + """ - kernel-include - ~~~~~~~~~~~~~~ +Implementation of the ``kernel-include`` reST-directive. + +:copyright: Copyright (C) 2016 Markus Heiser +:license: GPL Version 2, June 1991 see linux/COPYING for details. + +The ``kernel-include`` reST-directive is a replacement for the ``include`` +directive. The ``kernel-include`` directive expand environment variables in +the path name and allows to include files from arbitrary locations. + +.. hint:: + + Including files from arbitrary locations (e.g. from ``/etc``) is a + security risk for builders. This is why the ``include`` directive from + docutils *prohibit* pathnames pointing to locations *above* the filesystem + tree where the reST document with the include directive is placed. + +Substrings of the form $name or ${name} are replaced by the value of +environment variable name. Malformed variable names and references to +non-existing variables are left unchanged. + +**Supported Sphinx Include Options**: + +:param literal: + If present, the included file is inserted as a literal block. + +:param code: + Specify the language for syntax highlighting (e.g., 'c', 'python'). + +:param encoding: + Specify the encoding of the included file (default: 'utf-8'). + +:param tab-width: + Specify the number of spaces that a tab represents. + +:param start-line: + Line number at which to start including the file (1-based). + +:param end-line: + Line number at which to stop including the file (inclusive). + +:param start-after: + Include lines after the first line matching this text. + +:param end-before: + Include lines before the first line matching this text. - Implementation of the ``kernel-include`` reST-directive. +:param number-lines: + Number the included lines (integer specifies start number). + Only effective with 'literal' or 'code' options. - :copyright: Copyright (C) 2016 Markus Heiser - :license: GPL Version 2, June 1991 see linux/COPYING for details. +:param class: + Specify HTML class attribute for the included content. - The ``kernel-include`` reST-directive is a replacement for the ``include`` - directive. The ``kernel-include`` directive expand environment variables in - the path name and allows to include files from arbitrary locations. +**Kernel-specific Extensions**: - .. hint:: +:param generate-cross-refs: + If present, instead of directly including the file, it calls + ParseDataStructs() to convert C data structures into cross-references + that link to comprehensive documentation in other ReST files. - Including files from arbitrary locations (e.g. from ``/etc``) is a - security risk for builders. This is why the ``include`` directive from - docutils *prohibit* pathnames pointing to locations *above* the filesystem - tree where the reST document with the include directive is placed. +:param exception-file: + (Used with generate-cross-refs) - Substrings of the form $name or ${name} are replaced by the value of - environment variable name. Malformed variable names and references to - non-existing variables are left unchanged. + Path to a file containing rules for handling special cases: + - Ignore specific C data structures + - Use alternative reference names + - Specify different reference types + +:param warn-broken: + (Used with generate-cross-refs) + + Enables warnings when auto-generated cross-references don't point to + existing documentation targets. """ # ============================================================================== @@ -33,161 +84,341 @@ # ============================================================================== import os.path +import re +import sys from docutils import io, nodes, statemachine +from docutils.statemachine import ViewList from docutils.utils.error_reporting import SafeString, ErrorString -from docutils.parsers.rst import directives +from docutils.parsers.rst import Directive, directives from docutils.parsers.rst.directives.body import CodeBlock, NumberLines -from docutils.parsers.rst.directives.misc import Include -__version__ = '1.0' +from sphinx.util import logging -# ============================================================================== -def setup(app): -# ============================================================================== +srctree = os.path.abspath(os.environ["srctree"]) +sys.path.insert(0, os.path.join(srctree, "tools/docs/lib")) - app.add_directive("kernel-include", KernelInclude) - return dict( - version = __version__, - parallel_read_safe = True, - parallel_write_safe = True - ) +from parse_data_structs import ParseDataStructs -# ============================================================================== -class KernelInclude(Include): -# ============================================================================== +__version__ = "1.0" +logger = logging.getLogger(__name__) - """KernelInclude (``kernel-include``) directive""" +RE_DOMAIN_REF = re.compile(r'\\ :(ref|c:type|c:func):`([^<`]+)(?:<([^>]+)>)?`\\') +RE_SIMPLE_REF = re.compile(r'`([^`]+)`') - def run(self): - env = self.state.document.settings.env - path = os.path.realpath( - os.path.expandvars(self.arguments[0])) - # to get a bit security back, prohibit /etc: - if path.startswith(os.sep + "etc"): - raise self.severe( - 'Problems with "%s" directive, prohibited path: %s' - % (self.name, path)) +# ============================================================================== +class KernelInclude(Directive): + """ + KernelInclude (``kernel-include``) directive - self.arguments[0] = path + Most of the stuff here came from Include directive defined at: + docutils/parsers/rst/directives/misc.py - env.note_dependency(os.path.abspath(path)) + Yet, overriding the class don't has any benefits: the original class + only have run() and argument list. Not all of them are implemented, + when checked against latest Sphinx version, as with time more arguments + were added. - #return super(KernelInclude, self).run() # won't work, see HINTs in _run() - return self._run() + So, keep its own list of supported arguments + """ - def _run(self): - """Include a file as part of the content of this reST file.""" + required_arguments = 1 + optional_arguments = 0 + final_argument_whitespace = True + option_spec = { + 'literal': directives.flag, + 'code': directives.unchanged, + 'encoding': directives.encoding, + 'tab-width': int, + 'start-line': int, + 'end-line': int, + 'start-after': directives.unchanged_required, + 'end-before': directives.unchanged_required, + # ignored except for 'literal' or 'code': + 'number-lines': directives.unchanged, # integer or None + 'class': directives.class_option, - # HINT: I had to copy&paste the whole Include.run method. I'am not happy - # with this, but due to security reasons, the Include.run method does - # not allow absolute or relative pathnames pointing to locations *above* - # the filesystem tree where the reST document is placed. + # Arguments that aren't from Sphinx Include directive + 'generate-cross-refs': directives.flag, + 'warn-broken': directives.flag, + 'toc': directives.flag, + 'exception-file': directives.unchanged, + } - if not self.state.document.settings.file_insertion_enabled: - raise self.warning('"%s" directive disabled.' % self.name) - source = self.state_machine.input_lines.source( - self.lineno - self.state_machine.input_offset - 1) - source_dir = os.path.dirname(os.path.abspath(source)) - path = directives.path(self.arguments[0]) - if path.startswith('<') and path.endswith('>'): - path = os.path.join(self.standard_include_path, path[1:-1]) - path = os.path.normpath(os.path.join(source_dir, path)) + def read_rawtext(self, path, encoding): + """Read and process file content with error handling""" + try: + self.state.document.settings.record_dependencies.add(path) + include_file = io.FileInput(source_path=path, + encoding=encoding, + error_handler=self.state.document.settings.input_encoding_error_handler) + except UnicodeEncodeError: + raise self.severe('Problems with directive path:\n' + 'Cannot encode input file path "%s" ' + '(wrong locale?).' % SafeString(path)) + except IOError as error: + raise self.severe('Problems with directive path:\n%s.' % ErrorString(error)) - # HINT: this is the only line I had to change / commented out: - #path = utils.relative_path(None, path) + try: + return include_file.read() + except UnicodeError as error: + raise self.severe('Problem with directive:\n%s' % ErrorString(error)) - encoding = self.options.get( - 'encoding', self.state.document.settings.input_encoding) - e_handler=self.state.document.settings.input_encoding_error_handler - tab_width = self.options.get( - 'tab-width', self.state.document.settings.tab_width) - try: - self.state.document.settings.record_dependencies.add(path) - include_file = io.FileInput(source_path=path, - encoding=encoding, - error_handler=e_handler) - except UnicodeEncodeError as error: - raise self.severe('Problems with "%s" directive path:\n' - 'Cannot encode input file path "%s" ' - '(wrong locale?).' % - (self.name, SafeString(path))) - except IOError as error: - raise self.severe('Problems with "%s" directive path:\n%s.' % - (self.name, ErrorString(error))) + def apply_range(self, rawtext): + """ + Handles start-line, end-line, start-after and end-before parameters + """ + + # Get to-be-included content startline = self.options.get('start-line', None) endline = self.options.get('end-line', None) try: if startline or (endline is not None): - lines = include_file.readlines() - rawtext = ''.join(lines[startline:endline]) - else: - rawtext = include_file.read() + lines = rawtext.splitlines() + rawtext = '\n'.join(lines[startline:endline]) except UnicodeError as error: - raise self.severe('Problem with "%s" directive:\n%s' % - (self.name, ErrorString(error))) + raise self.severe(f'Problem with "{self.name}" directive:\n' + + io.error_string(error)) # start-after/end-before: no restrictions on newlines in match-text, # and no restrictions on matching inside lines vs. line boundaries - after_text = self.options.get('start-after', None) + after_text = self.options.get("start-after", None) if after_text: # skip content in rawtext before *and incl.* a matching text after_index = rawtext.find(after_text) if after_index < 0: raise self.severe('Problem with "start-after" option of "%s" ' - 'directive:\nText not found.' % self.name) - rawtext = rawtext[after_index + len(after_text):] - before_text = self.options.get('end-before', None) + "directive:\nText not found." % self.name) + rawtext = rawtext[after_index + len(after_text) :] + before_text = self.options.get("end-before", None) if before_text: # skip content in rawtext after *and incl.* a matching text before_index = rawtext.find(before_text) if before_index < 0: raise self.severe('Problem with "end-before" option of "%s" ' - 'directive:\nText not found.' % self.name) + "directive:\nText not found." % self.name) rawtext = rawtext[:before_index] + return rawtext + + def xref_text(self, env, path, tab_width): + """ + Read and add contents from a C file parsed to have cross references. + + There are two types of supported output here: + - A C source code with cross-references; + - a TOC table containing cross references. + """ + parser = ParseDataStructs() + parser.parse_file(path) + + if 'exception-file' in self.options: + source_dir = os.path.dirname(os.path.abspath( + self.state_machine.input_lines.source( + self.lineno - self.state_machine.input_offset - 1))) + exceptions_file = os.path.join(source_dir, self.options['exception-file']) + parser.process_exceptions(exceptions_file) + + # Store references on a symbol dict to be used at check time + if 'warn-broken' in self.options: + env._xref_files.add(path) + + if "toc" not in self.options: + + rawtext = ".. parsed-literal::\n\n" + parser.gen_output() + self.apply_range(rawtext) + + include_lines = statemachine.string2lines(rawtext, tab_width, + convert_whitespace=True) + + # Sphinx always blame the ".. <directive>", so placing + # line numbers here won't make any difference + + self.state_machine.insert_input(include_lines, path) + return [] + + # TOC output is a ReST file, not a literal. So, we can add line + # numbers + + rawtext = parser.gen_toc() + include_lines = statemachine.string2lines(rawtext, tab_width, convert_whitespace=True) - if 'literal' in self.options: - # Convert tabs to spaces, if `tab_width` is positive. - if tab_width >= 0: - text = rawtext.expandtabs(tab_width) - else: - text = rawtext - literal_block = nodes.literal_block(rawtext, source=path, - classes=self.options.get('class', [])) - literal_block.line = 1 - self.add_name(literal_block) - if 'number-lines' in self.options: - try: - startline = int(self.options['number-lines'] or 1) - except ValueError: - raise self.error(':number-lines: with non-integer ' - 'start value') - endline = startline + len(include_lines) - if text.endswith('\n'): - text = text[:-1] - tokens = NumberLines([([], text)], startline, endline) - for classes, value in tokens: - if classes: - literal_block += nodes.inline(value, value, - classes=classes) - else: - literal_block += nodes.Text(value, value) - else: - literal_block += nodes.Text(text, text) - return [literal_block] - if 'code' in self.options: - self.options['source'] = path - codeblock = CodeBlock(self.name, - [self.options.pop('code')], # arguments - self.options, - include_lines, # content - self.lineno, - self.content_offset, - self.block_text, - self.state, - self.state_machine) - return codeblock.run() - self.state_machine.insert_input(include_lines, path) + + # Append line numbers data + + startline = self.options.get('start-line', None) + + result = ViewList() + if startline and startline > 0: + offset = startline - 1 + else: + offset = 0 + + for ln, line in enumerate(include_lines, start=offset): + result.append(line, path, ln) + + self.state_machine.insert_input(result, path) + return [] + + def literal(self, path, tab_width, rawtext): + """Output a literal block""" + + # Convert tabs to spaces, if `tab_width` is positive. + if tab_width >= 0: + text = rawtext.expandtabs(tab_width) + else: + text = rawtext + literal_block = nodes.literal_block(rawtext, source=path, + classes=self.options.get("class", [])) + literal_block.line = 1 + self.add_name(literal_block) + if "number-lines" in self.options: + try: + startline = int(self.options["number-lines"] or 1) + except ValueError: + raise self.error(":number-lines: with non-integer start value") + endline = startline + len(include_lines) + if text.endswith("\n"): + text = text[:-1] + tokens = NumberLines([([], text)], startline, endline) + for classes, value in tokens: + if classes: + literal_block += nodes.inline(value, value, + classes=classes) + else: + literal_block += nodes.Text(value, value) + else: + literal_block += nodes.Text(text, text) + return [literal_block] + + def code(self, path, tab_width): + """Output a code block""" + + include_lines = statemachine.string2lines(rawtext, tab_width, + convert_whitespace=True) + + self.options["source"] = path + codeblock = CodeBlock(self.name, + [self.options.pop("code")], # arguments + self.options, + include_lines, + self.lineno, + self.content_offset, + self.block_text, + self.state, + self.state_machine) + return codeblock.run() + + def run(self): + """Include a file as part of the content of this reST file.""" + env = self.state.document.settings.env + path = os.path.realpath(os.path.expandvars(self.arguments[0])) + + # to get a bit security back, prohibit /etc: + if path.startswith(os.sep + "etc"): + raise self.severe('Problems with "%s" directive, prohibited path: %s' % + (self.name, path)) + + self.arguments[0] = path + + env.note_dependency(os.path.abspath(path)) + + # HINT: I had to copy&paste the whole Include.run method. I'am not happy + # with this, but due to security reasons, the Include.run method does + # not allow absolute or relative pathnames pointing to locations *above* + # the filesystem tree where the reST document is placed. + + if not self.state.document.settings.file_insertion_enabled: + raise self.warning('"%s" directive disabled.' % self.name) + source = self.state_machine.input_lines.source(self.lineno - + self.state_machine.input_offset - 1) + source_dir = os.path.dirname(os.path.abspath(source)) + path = directives.path(self.arguments[0]) + if path.startswith("<") and path.endswith(">"): + path = os.path.join(self.standard_include_path, path[1:-1]) + path = os.path.normpath(os.path.join(source_dir, path)) + + # HINT: this is the only line I had to change / commented out: + # path = utils.relative_path(None, path) + + encoding = self.options.get("encoding", + self.state.document.settings.input_encoding) + tab_width = self.options.get("tab-width", + self.state.document.settings.tab_width) + + # Get optional arguments to related to cross-references generation + if "generate-cross-refs" in self.options: + return self.xref_text(env, path, tab_width) + + rawtext = self.read_rawtext(path, encoding) + rawtext = self.apply_range(rawtext) + + if "code" in self.options: + return self.code(path, tab_width, rawtext) + + return self.literal(path, tab_width, rawtext) + +# ============================================================================== + +reported = set() + +def check_missing_refs(app, env, node, contnode): + """Check broken refs for the files it creates xrefs""" + if not node.source: + return None + + try: + xref_files = env._xref_files + except AttributeError: + logger.critical("FATAL: _xref_files not initialized!") + raise + + # Only show missing references for kernel-include reference-parsed files + if node.source not in xref_files: + return None + + target = node.get('reftarget', '') + domain = node.get('refdomain', 'std') + reftype = node.get('reftype', '') + + msg = f"can't link to: {domain}:{reftype}:: {target}" + + # Don't duplicate warnings + data = (node.source, msg) + if data in reported: + return None + reported.add(data) + + logger.warning(msg, location=node, type='ref', subtype='missing') + + return None + +def merge_xref_info(app, env, docnames, other): + """ + As each process modify env._xref_files, we need to merge them back. + """ + if not hasattr(other, "_xref_files"): + return + env._xref_files.update(getattr(other, "_xref_files", set())) + +def init_xref_docs(app, env, docnames): + """Initialize a list of files that we're generating cross references¨""" + app.env._xref_files = set() + +# ============================================================================== + +def setup(app): + """Setup Sphinx exension""" + + app.connect("env-before-read-docs", init_xref_docs) + app.connect("env-merge-info", merge_xref_info) + app.add_directive("kernel-include", KernelInclude) + app.connect("missing-reference", check_missing_refs) + + return { + "version": __version__, + "parallel_read_safe": True, + "parallel_write_safe": True, + } diff --git a/Documentation/sphinx/parse-headers.pl b/Documentation/sphinx/parse-headers.pl deleted file mode 100755 index 7b1458544e2e..000000000000 --- a/Documentation/sphinx/parse-headers.pl +++ /dev/null @@ -1,404 +0,0 @@ -#!/usr/bin/env perl -# SPDX-License-Identifier: GPL-2.0 -# Copyright (c) 2016 by Mauro Carvalho Chehab <mchehab@kernel.org>. - -use strict; -use Text::Tabs; -use Getopt::Long; -use Pod::Usage; - -my $debug; -my $help; -my $man; - -GetOptions( - "debug" => \$debug, - 'usage|?' => \$help, - 'help' => \$man -) or pod2usage(2); - -pod2usage(1) if $help; -pod2usage(-exitstatus => 0, -verbose => 2) if $man; -pod2usage(2) if (scalar @ARGV < 2 || scalar @ARGV > 3); - -my ($file_in, $file_out, $file_exceptions) = @ARGV; - -my $data; -my %ioctls; -my %defines; -my %typedefs; -my %enums; -my %enum_symbols; -my %structs; - -require Data::Dumper if ($debug); - -# -# read the file and get identifiers -# - -my $is_enum = 0; -my $is_comment = 0; -open IN, $file_in or die "Can't open $file_in"; -while (<IN>) { - $data .= $_; - - my $ln = $_; - if (!$is_comment) { - $ln =~ s,/\*.*(\*/),,g; - - $is_comment = 1 if ($ln =~ s,/\*.*,,); - } else { - if ($ln =~ s,^(.*\*/),,) { - $is_comment = 0; - } else { - next; - } - } - - if ($is_enum && $ln =~ m/^\s*([_\w][\w\d_]+)\s*[\,=]?/) { - my $s = $1; - my $n = $1; - $n =~ tr/A-Z/a-z/; - $n =~ tr/_/-/; - - $enum_symbols{$s} = "\\ :ref:`$s <$n>`\\ "; - - $is_enum = 0 if ($is_enum && m/\}/); - next; - } - $is_enum = 0 if ($is_enum && m/\}/); - - if ($ln =~ m/^\s*#\s*define\s+([_\w][\w\d_]+)\s+_IO/) { - my $s = $1; - my $n = $1; - $n =~ tr/A-Z/a-z/; - - $ioctls{$s} = "\\ :ref:`$s <$n>`\\ "; - next; - } - - if ($ln =~ m/^\s*#\s*define\s+([_\w][\w\d_]+)\s+/) { - my $s = $1; - my $n = $1; - $n =~ tr/A-Z/a-z/; - $n =~ tr/_/-/; - - $defines{$s} = "\\ :ref:`$s <$n>`\\ "; - next; - } - - if ($ln =~ m/^\s*typedef\s+([_\w][\w\d_]+)\s+(.*)\s+([_\w][\w\d_]+);/) { - my $s = $2; - my $n = $3; - - $typedefs{$n} = "\\ :c:type:`$n <$s>`\\ "; - next; - } - if ($ln =~ m/^\s*enum\s+([_\w][\w\d_]+)\s+\{/ - || $ln =~ m/^\s*enum\s+([_\w][\w\d_]+)$/ - || $ln =~ m/^\s*typedef\s*enum\s+([_\w][\w\d_]+)\s+\{/ - || $ln =~ m/^\s*typedef\s*enum\s+([_\w][\w\d_]+)$/) { - my $s = $1; - - $enums{$s} = "enum :c:type:`$s`\\ "; - - $is_enum = $1; - next; - } - if ($ln =~ m/^\s*struct\s+([_\w][\w\d_]+)\s+\{/ - || $ln =~ m/^\s*struct\s+([[_\w][\w\d_]+)$/ - || $ln =~ m/^\s*typedef\s*struct\s+([_\w][\w\d_]+)\s+\{/ - || $ln =~ m/^\s*typedef\s*struct\s+([[_\w][\w\d_]+)$/ - ) { - my $s = $1; - - $structs{$s} = "struct $s\\ "; - next; - } -} -close IN; - -# -# Handle multi-line typedefs -# - -my @matches = ($data =~ m/typedef\s+struct\s+\S+?\s*\{[^\}]+\}\s*(\S+)\s*\;/g, - $data =~ m/typedef\s+enum\s+\S+?\s*\{[^\}]+\}\s*(\S+)\s*\;/g,); -foreach my $m (@matches) { - my $s = $m; - - $typedefs{$s} = "\\ :c:type:`$s`\\ "; - next; -} - -# -# Handle exceptions, if any -# - -my %def_reftype = ( - "ioctl" => ":ref", - "define" => ":ref", - "symbol" => ":ref", - "typedef" => ":c:type", - "enum" => ":c:type", - "struct" => ":c:type", -); - -if ($file_exceptions) { - open IN, $file_exceptions or die "Can't read $file_exceptions"; - while (<IN>) { - next if (m/^\s*$/ || m/^\s*#/); - - # Parsers to ignore a symbol - - if (m/^ignore\s+ioctl\s+(\S+)/) { - delete $ioctls{$1} if (exists($ioctls{$1})); - next; - } - if (m/^ignore\s+define\s+(\S+)/) { - delete $defines{$1} if (exists($defines{$1})); - next; - } - if (m/^ignore\s+typedef\s+(\S+)/) { - delete $typedefs{$1} if (exists($typedefs{$1})); - next; - } - if (m/^ignore\s+enum\s+(\S+)/) { - delete $enums{$1} if (exists($enums{$1})); - next; - } - if (m/^ignore\s+struct\s+(\S+)/) { - delete $structs{$1} if (exists($structs{$1})); - next; - } - if (m/^ignore\s+symbol\s+(\S+)/) { - delete $enum_symbols{$1} if (exists($enum_symbols{$1})); - next; - } - - # Parsers to replace a symbol - my ($type, $old, $new, $reftype); - - if (m/^replace\s+(\S+)\s+(\S+)\s+(\S+)/) { - $type = $1; - $old = $2; - $new = $3; - } else { - die "Can't parse $file_exceptions: $_"; - } - - if ($new =~ m/^\:c\:(data|func|macro|type)\:\`(.+)\`/) { - $reftype = ":c:$1"; - $new = $2; - } elsif ($new =~ m/\:ref\:\`(.+)\`/) { - $reftype = ":ref"; - $new = $1; - } else { - $reftype = $def_reftype{$type}; - } - $new = "$reftype:`$old <$new>`"; - - if ($type eq "ioctl") { - $ioctls{$old} = $new if (exists($ioctls{$old})); - next; - } - if ($type eq "define") { - $defines{$old} = $new if (exists($defines{$old})); - next; - } - if ($type eq "symbol") { - $enum_symbols{$old} = $new if (exists($enum_symbols{$old})); - next; - } - if ($type eq "typedef") { - $typedefs{$old} = $new if (exists($typedefs{$old})); - next; - } - if ($type eq "enum") { - $enums{$old} = $new if (exists($enums{$old})); - next; - } - if ($type eq "struct") { - $structs{$old} = $new if (exists($structs{$old})); - next; - } - - die "Can't parse $file_exceptions: $_"; - } -} - -if ($debug) { - print Data::Dumper->Dump([\%ioctls], [qw(*ioctls)]) if (%ioctls); - print Data::Dumper->Dump([\%typedefs], [qw(*typedefs)]) if (%typedefs); - print Data::Dumper->Dump([\%enums], [qw(*enums)]) if (%enums); - print Data::Dumper->Dump([\%structs], [qw(*structs)]) if (%structs); - print Data::Dumper->Dump([\%defines], [qw(*defines)]) if (%defines); - print Data::Dumper->Dump([\%enum_symbols], [qw(*enum_symbols)]) if (%enum_symbols); -} - -# -# Align block -# -$data = expand($data); -$data = " " . $data; -$data =~ s/\n/\n /g; -$data =~ s/\n\s+$/\n/g; -$data =~ s/\n\s+\n/\n\n/g; - -# -# Add escape codes for special characters -# -$data =~ s,([\_\`\*\<\>\&\\\\:\/\|\%\$\#\{\}\~\^]),\\$1,g; - -$data =~ s,DEPRECATED,**DEPRECATED**,g; - -# -# Add references -# - -my $start_delim = "[ \n\t\(\=\*\@]"; -my $end_delim = "(\\s|,|\\\\=|\\\\:|\\;|\\\)|\\}|\\{)"; - -foreach my $r (keys %ioctls) { - my $s = $ioctls{$r}; - - $r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g; - - print "$r -> $s\n" if ($debug); - - $data =~ s/($start_delim)($r)$end_delim/$1$s$3/g; -} - -foreach my $r (keys %defines) { - my $s = $defines{$r}; - - $r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g; - - print "$r -> $s\n" if ($debug); - - $data =~ s/($start_delim)($r)$end_delim/$1$s$3/g; -} - -foreach my $r (keys %enum_symbols) { - my $s = $enum_symbols{$r}; - - $r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g; - - print "$r -> $s\n" if ($debug); - - $data =~ s/($start_delim)($r)$end_delim/$1$s$3/g; -} - -foreach my $r (keys %enums) { - my $s = $enums{$r}; - - $r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g; - - print "$r -> $s\n" if ($debug); - - $data =~ s/enum\s+($r)$end_delim/$s$2/g; -} - -foreach my $r (keys %structs) { - my $s = $structs{$r}; - - $r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g; - - print "$r -> $s\n" if ($debug); - - $data =~ s/struct\s+($r)$end_delim/$s$2/g; -} - -foreach my $r (keys %typedefs) { - my $s = $typedefs{$r}; - - $r =~ s,([\_\`\*\<\>\&\\\\:\/]),\\\\$1,g; - - print "$r -> $s\n" if ($debug); - $data =~ s/($start_delim)($r)$end_delim/$1$s$3/g; -} - -$data =~ s/\\ ([\n\s])/\1/g; - -# -# Generate output file -# - -my $title = $file_in; -$title =~ s,.*/,,; - -open OUT, "> $file_out" or die "Can't open $file_out"; -print OUT ".. -*- coding: utf-8; mode: rst -*-\n\n"; -print OUT "$title\n"; -print OUT "=" x length($title); -print OUT "\n\n.. parsed-literal::\n\n"; -print OUT $data; -close OUT; - -__END__ - -=head1 NAME - -parse_headers.pl - parse a C file, in order to identify functions, structs, -enums and defines and create cross-references to a Sphinx book. - -=head1 SYNOPSIS - -B<parse_headers.pl> [<options>] <C_FILE> <OUT_FILE> [<EXCEPTIONS_FILE>] - -Where <options> can be: --debug, --help or --usage. - -=head1 OPTIONS - -=over 8 - -=item B<--debug> - -Put the script in verbose mode, useful for debugging. - -=item B<--usage> - -Prints a brief help message and exits. - -=item B<--help> - -Prints a more detailed help message and exits. - -=back - -=head1 DESCRIPTION - -Convert a C header or source file (C_FILE), into a ReStructured Text -included via ..parsed-literal block with cross-references for the -documentation files that describe the API. It accepts an optional -EXCEPTIONS_FILE with describes what elements will be either ignored or -be pointed to a non-default reference. - -The output is written at the (OUT_FILE). - -It is capable of identifying defines, functions, structs, typedefs, -enums and enum symbols and create cross-references for all of them. -It is also capable of distinguish #define used for specifying a Linux -ioctl. - -The EXCEPTIONS_FILE contain two rules to allow ignoring a symbol or -to replace the default references by a custom one. - -Please read Documentation/doc-guide/parse-headers.rst at the Kernel's -tree for more details. - -=head1 BUGS - -Report bugs to Mauro Carvalho Chehab <mchehab@kernel.org> - -=head1 COPYRIGHT - -Copyright (c) 2016 by Mauro Carvalho Chehab <mchehab@kernel.org>. - -License GPLv2: GNU GPL version 2 <https://gnu.org/licenses/gpl.html>. - -This is free software: you are free to change and redistribute it. -There is NO WARRANTY, to the extent permitted by law. - -=cut diff --git a/Documentation/userspace-api/media/Makefile b/Documentation/userspace-api/media/Makefile deleted file mode 100644 index 3d8aaf5c253b..000000000000 --- a/Documentation/userspace-api/media/Makefile +++ /dev/null @@ -1,64 +0,0 @@ -# SPDX-License-Identifier: GPL-2.0 - -# Rules to convert a .h file to inline RST documentation - -SRC_DIR=$(srctree)/Documentation/userspace-api/media -PARSER = $(srctree)/Documentation/sphinx/parse-headers.pl -UAPI = $(srctree)/include/uapi/linux -KAPI = $(srctree)/include/linux - -FILES = ca.h.rst dmx.h.rst frontend.h.rst net.h.rst \ - videodev2.h.rst media.h.rst cec.h.rst lirc.h.rst - -TARGETS := $(addprefix $(BUILDDIR)/, $(FILES)) - -gen_rst = \ - echo ${PARSER} $< $@ $(SRC_DIR)/$(notdir $@).exceptions; \ - ${PARSER} $< $@ $(SRC_DIR)/$(notdir $@).exceptions - -quiet_gen_rst = echo ' PARSE $(patsubst $(srctree)/%,%,$<)'; \ - ${PARSER} $< $@ $(SRC_DIR)/$(notdir $@).exceptions - -silent_gen_rst = ${gen_rst} - -$(BUILDDIR)/ca.h.rst: ${UAPI}/dvb/ca.h ${PARSER} $(SRC_DIR)/ca.h.rst.exceptions - @$($(quiet)gen_rst) - -$(BUILDDIR)/dmx.h.rst: ${UAPI}/dvb/dmx.h ${PARSER} $(SRC_DIR)/dmx.h.rst.exceptions - @$($(quiet)gen_rst) - -$(BUILDDIR)/frontend.h.rst: ${UAPI}/dvb/frontend.h ${PARSER} $(SRC_DIR)/frontend.h.rst.exceptions - @$($(quiet)gen_rst) - -$(BUILDDIR)/net.h.rst: ${UAPI}/dvb/net.h ${PARSER} $(SRC_DIR)/net.h.rst.exceptions - @$($(quiet)gen_rst) - -$(BUILDDIR)/videodev2.h.rst: ${UAPI}/videodev2.h ${PARSER} $(SRC_DIR)/videodev2.h.rst.exceptions - @$($(quiet)gen_rst) - -$(BUILDDIR)/media.h.rst: ${UAPI}/media.h ${PARSER} $(SRC_DIR)/media.h.rst.exceptions - @$($(quiet)gen_rst) - -$(BUILDDIR)/cec.h.rst: ${UAPI}/cec.h ${PARSER} $(SRC_DIR)/cec.h.rst.exceptions - @$($(quiet)gen_rst) - -$(BUILDDIR)/lirc.h.rst: ${UAPI}/lirc.h ${PARSER} $(SRC_DIR)/lirc.h.rst.exceptions - @$($(quiet)gen_rst) - -# Media build rules - -.PHONY: all html texinfo epub xml latex - -all: $(IMGDOT) $(BUILDDIR) ${TARGETS} -html: all -texinfo: all -epub: all -xml: all -latex: $(IMGPDF) all -linkcheck: - -clean: - -rm -f $(DOTTGT) $(IMGTGT) ${TARGETS} 2>/dev/null - -$(BUILDDIR): - $(Q)mkdir -p $@ diff --git a/Documentation/userspace-api/media/cec/cec-header.rst b/Documentation/userspace-api/media/cec/cec-header.rst index d70736ac2b1d..f67003bb8740 100644 --- a/Documentation/userspace-api/media/cec/cec-header.rst +++ b/Documentation/userspace-api/media/cec/cec-header.rst @@ -6,5 +6,6 @@ CEC Header File *************** -.. kernel-include:: $BUILDDIR/cec.h.rst - +.. kernel-include:: include/uapi/linux/cec.h + :generate-cross-refs: + :exception-file: cec.h.rst.exceptions diff --git a/Documentation/userspace-api/media/cec.h.rst.exceptions b/Documentation/userspace-api/media/cec/cec.h.rst.exceptions index 15fa1752d4ef..15fa1752d4ef 100644 --- a/Documentation/userspace-api/media/cec.h.rst.exceptions +++ b/Documentation/userspace-api/media/cec/cec.h.rst.exceptions diff --git a/Documentation/userspace-api/media/ca.h.rst.exceptions b/Documentation/userspace-api/media/dvb/ca.h.rst.exceptions index f6828238eb48..f6828238eb48 100644 --- a/Documentation/userspace-api/media/ca.h.rst.exceptions +++ b/Documentation/userspace-api/media/dvb/ca.h.rst.exceptions diff --git a/Documentation/userspace-api/media/dmx.h.rst.exceptions b/Documentation/userspace-api/media/dvb/dmx.h.rst.exceptions index afc14d384b83..afc14d384b83 100644 --- a/Documentation/userspace-api/media/dmx.h.rst.exceptions +++ b/Documentation/userspace-api/media/dvb/dmx.h.rst.exceptions diff --git a/Documentation/userspace-api/media/frontend.h.rst.exceptions b/Documentation/userspace-api/media/dvb/frontend.h.rst.exceptions index dcaf5740de7e..dcaf5740de7e 100644 --- a/Documentation/userspace-api/media/frontend.h.rst.exceptions +++ b/Documentation/userspace-api/media/dvb/frontend.h.rst.exceptions diff --git a/Documentation/userspace-api/media/dvb/headers.rst b/Documentation/userspace-api/media/dvb/headers.rst index 88c3eb33a89e..c75f64cf21d5 100644 --- a/Documentation/userspace-api/media/dvb/headers.rst +++ b/Documentation/userspace-api/media/dvb/headers.rst @@ -7,10 +7,19 @@ Digital TV uAPI header files Digital TV uAPI headers *********************** -.. kernel-include:: $BUILDDIR/frontend.h.rst +.. kernel-include:: include/uapi/linux/dvb/frontend.h + :generate-cross-refs: + :exception-file: frontend.h.rst.exceptions -.. kernel-include:: $BUILDDIR/dmx.h.rst +.. kernel-include:: include/uapi/linux/dvb/dmx.h + :generate-cross-refs: + :exception-file: dmx.h.rst.exceptions -.. kernel-include:: $BUILDDIR/ca.h.rst +.. kernel-include:: include/uapi/linux/dvb/ca.h + :generate-cross-refs: + :exception-file: ca.h.rst.exceptions + +.. kernel-include:: include/uapi/linux/dvb/net.h + :generate-cross-refs: + :exception-file: net.h.rst.exceptions -.. kernel-include:: $BUILDDIR/net.h.rst diff --git a/Documentation/userspace-api/media/net.h.rst.exceptions b/Documentation/userspace-api/media/dvb/net.h.rst.exceptions index 5159aa4bbbb9..5159aa4bbbb9 100644 --- a/Documentation/userspace-api/media/net.h.rst.exceptions +++ b/Documentation/userspace-api/media/dvb/net.h.rst.exceptions diff --git a/Documentation/userspace-api/media/mediactl/media-header.rst b/Documentation/userspace-api/media/mediactl/media-header.rst index c674271c93f5..d561d2845f3d 100644 --- a/Documentation/userspace-api/media/mediactl/media-header.rst +++ b/Documentation/userspace-api/media/mediactl/media-header.rst @@ -6,5 +6,6 @@ Media Controller Header File **************************** -.. kernel-include:: $BUILDDIR/media.h.rst - +.. kernel-include:: include/uapi/linux/media.h + :generate-cross-refs: + :exception-file: media.h.rst.exceptions diff --git a/Documentation/userspace-api/media/media.h.rst.exceptions b/Documentation/userspace-api/media/mediactl/media.h.rst.exceptions index 9b4c26502d95..9b4c26502d95 100644 --- a/Documentation/userspace-api/media/media.h.rst.exceptions +++ b/Documentation/userspace-api/media/mediactl/media.h.rst.exceptions diff --git a/Documentation/userspace-api/media/rc/lirc-header.rst b/Documentation/userspace-api/media/rc/lirc-header.rst index 54cb40b8a065..a53328327847 100644 --- a/Documentation/userspace-api/media/rc/lirc-header.rst +++ b/Documentation/userspace-api/media/rc/lirc-header.rst @@ -6,5 +6,7 @@ LIRC Header File **************** -.. kernel-include:: $BUILDDIR/lirc.h.rst +.. kernel-include:: include/uapi/linux/lirc.h + :generate-cross-refs: + :exception-file: lirc.h.rst.exceptions diff --git a/Documentation/userspace-api/media/lirc.h.rst.exceptions b/Documentation/userspace-api/media/rc/lirc.h.rst.exceptions index 1aeb7d7afe13..1aeb7d7afe13 100644 --- a/Documentation/userspace-api/media/lirc.h.rst.exceptions +++ b/Documentation/userspace-api/media/rc/lirc.h.rst.exceptions diff --git a/Documentation/userspace-api/media/v4l/videodev.rst b/Documentation/userspace-api/media/v4l/videodev.rst index c866fec417eb..cde485bc9a5f 100644 --- a/Documentation/userspace-api/media/v4l/videodev.rst +++ b/Documentation/userspace-api/media/v4l/videodev.rst @@ -6,4 +6,6 @@ Video For Linux Two Header File ******************************* -.. kernel-include:: $BUILDDIR/videodev2.h.rst +.. kernel-include:: include/uapi/linux/videodev2.h + :generate-cross-refs: + :exception-file: videodev2.h.rst.exceptions diff --git a/Documentation/userspace-api/media/videodev2.h.rst.exceptions b/Documentation/userspace-api/media/v4l/videodev2.h.rst.exceptions index 35d3456cc812..35d3456cc812 100644 --- a/Documentation/userspace-api/media/videodev2.h.rst.exceptions +++ b/Documentation/userspace-api/media/v4l/videodev2.h.rst.exceptions diff --git a/MAINTAINERS b/MAINTAINERS index dafc11712544..ef87548b8f88 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7308,6 +7308,7 @@ F: scripts/get_abi.py F: scripts/kernel-doc* F: scripts/lib/abi/* F: scripts/lib/kdoc/* +F: tools/docs/* F: tools/net/ynl/pyynl/lib/doc_generator.py F: scripts/sphinx-pre-install X: Documentation/ABI/ diff --git a/scripts/sphinx-build-wrapper b/scripts/sphinx-build-wrapper new file mode 100755 index 000000000000..abe8c26ae137 --- /dev/null +++ b/scripts/sphinx-build-wrapper @@ -0,0 +1,719 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2025 Mauro Carvalho Chehab <mchehab+huawei@kernel.org> +# +# pylint: disable=R0902, R0912, R0913, R0914, R0915, R0917, C0103 +# +# Converted from docs Makefile and parallel-wrapper.sh, both under +# GPLv2, copyrighted since 2008 by the following authors: +# +# Akira Yokosawa <akiyks@gmail.com> +# Arnd Bergmann <arnd@arndb.de> +# Breno Leitao <leitao@debian.org> +# Carlos Bilbao <carlos.bilbao@amd.com> +# Dave Young <dyoung@redhat.com> +# Donald Hunter <donald.hunter@gmail.com> +# Geert Uytterhoeven <geert+renesas@glider.be> +# Jani Nikula <jani.nikula@intel.com> +# Jan Stancek <jstancek@redhat.com> +# Jonathan Corbet <corbet@lwn.net> +# Joshua Clayton <stillcompiling@gmail.com> +# Kees Cook <keescook@chromium.org> +# Linus Torvalds <torvalds@linux-foundation.org> +# Magnus Damm <damm+renesas@opensource.se> +# Masahiro Yamada <masahiroy@kernel.org> +# Mauro Carvalho Chehab <mchehab+huawei@kernel.org> +# Maxim Cournoyer <maxim.cournoyer@gmail.com> +# Peter Foley <pefoley2@pefoley.com> +# Randy Dunlap <rdunlap@infradead.org> +# Rob Herring <robh@kernel.org> +# Shuah Khan <shuahkh@osg.samsung.com> +# Thorsten Blum <thorsten.blum@toblux.com> +# Tomas Winkler <tomas.winkler@intel.com> + + +""" +Sphinx build wrapper that handles Kernel-specific business rules: + +- it gets the Kernel build environment vars; +- it determines what's the best parallelism; +- it handles SPHINXDIRS + +This tool ensures that MIN_PYTHON_VERSION is satisfied. If version is +below that, it seeks for a new Python version. If found, it re-runs using +the newer version. +""" + +import argparse +import locale +import os +import re +import shlex +import shutil +import subprocess +import sys + +from concurrent import futures +from glob import glob + +LIB_DIR = "lib" +SRC_DIR = os.path.dirname(os.path.realpath(__file__)) + +sys.path.insert(0, os.path.join(SRC_DIR, LIB_DIR)) + +from jobserver import JobserverExec # pylint: disable=C0413 + + +def parse_version(version): + """Convert a major.minor.patch version into a tuple""" + return tuple(int(x) for x in version.split(".")) + +def ver_str(version): + """Returns a version tuple as major.minor.patch""" + + return ".".join([str(x) for x in version]) + +# Minimal supported Python version needed by Sphinx and its extensions +MIN_PYTHON_VERSION = parse_version("3.7") + +# Default value for --venv parameter +VENV_DEFAULT = "sphinx_latest" + +# List of make targets and its corresponding builder and output directory +TARGETS = { + "cleandocs": { + "builder": "clean", + }, + "htmldocs": { + "builder": "html", + }, + "epubdocs": { + "builder": "epub", + "out_dir": "epub", + }, + "texinfodocs": { + "builder": "texinfo", + "out_dir": "texinfo", + }, + "infodocs": { + "builder": "texinfo", + "out_dir": "texinfo", + }, + "latexdocs": { + "builder": "latex", + "out_dir": "latex", + }, + "pdfdocs": { + "builder": "latex", + "out_dir": "latex", + }, + "xmldocs": { + "builder": "xml", + "out_dir": "xml", + }, + "linkcheckdocs": { + "builder": "linkcheck" + }, +} + +# Paper sizes. An empty value will pick the default +PAPER = ["", "a4", "letter"] + +class SphinxBuilder: + """ + Handles a sphinx-build target, adding needed arguments to build + with the Kernel. + """ + + def is_rust_enabled(self): + """Check if rust is enabled at .config""" + config_path = os.path.join(self.srctree, ".config") + if os.path.isfile(config_path): + with open(config_path, "r", encoding="utf-8") as f: + return "CONFIG_RUST=y" in f.read() + return False + + def get_path(self, path, abs_path=False): + """ + Ancillary routine to handle patches the right way, as shell does. + + It first expands "~" and "~user". Then, if patch is not absolute, + join self.srctree. Finally, if requested, convert to abspath. + """ + + path = os.path.expanduser(path) + if not path.startswith("/"): + path = os.path.join(self.srctree, path) + + if abs_path: + return os.path.abspath(path) + + return path + + def __init__(self, venv=None, verbose=False, n_jobs=None, interactive=None): + """Initialize internal variables""" + self.venv = venv + self.verbose = None + + # Normal variables passed from Kernel's makefile + self.kernelversion = os.environ.get("KERNELVERSION", "unknown") + self.kernelrelease = os.environ.get("KERNELRELEASE", "unknown") + self.pdflatex = os.environ.get("PDFLATEX", "xelatex") + + if not interactive: + self.latexopts = os.environ.get("LATEXOPTS", "-interaction=batchmode -no-shell-escape") + else: + self.latexopts = os.environ.get("LATEXOPTS", "") + + if not verbose: + verbose = bool(os.environ.get("KBUILD_VERBOSE", "") != "") + + # Handle SPHINXOPTS evironment + sphinxopts = shlex.split(os.environ.get("SPHINXOPTS", "")) + + # As we handle number of jobs and quiet in separate, we need to pick + # it the same way as sphinx-build would pick, so let's use argparse + # do to the right argument expansion + parser = argparse.ArgumentParser() + parser.add_argument('-j', '--jobs', type=int) + parser.add_argument('-q', '--quiet', type=int) + + # Other sphinx-build arguments go as-is, so place them + # at self.sphinxopts + sphinx_args, self.sphinxopts = parser.parse_known_args(sphinxopts) + if sphinx_args.quiet == True: + self.verbose = False + + if sphinx_args.jobs: + self.n_jobs = sphinx_args.jobs + + # Command line arguments was passed, override SPHINXOPTS + if verbose is not None: + self.verbose = verbose + + self.n_jobs = n_jobs + + # Source tree directory. This needs to be at os.environ, as + # Sphinx extensions and media uAPI makefile needs it + self.srctree = os.environ.get("srctree") + if not self.srctree: + self.srctree = "." + os.environ["srctree"] = self.srctree + + # Now that we can expand srctree, get other directories as well + self.sphinxbuild = os.environ.get("SPHINXBUILD", "sphinx-build") + self.kerneldoc = self.get_path(os.environ.get("KERNELDOC", + "scripts/kernel-doc.py")) + self.obj = os.environ.get("obj", "Documentation") + self.builddir = self.get_path(os.path.join(self.obj, "output"), + abs_path=True) + + # Media uAPI needs it + os.environ["BUILDDIR"] = self.builddir + + # Detect if rust is enabled + self.config_rust = self.is_rust_enabled() + + # Get directory locations for LaTeX build toolchain + self.pdflatex_cmd = shutil.which(self.pdflatex) + self.latexmk_cmd = shutil.which("latexmk") + + self.env = os.environ.copy() + + # If venv parameter is specified, run Sphinx from venv + if venv: + bin_dir = os.path.join(venv, "bin") + if os.path.isfile(os.path.join(bin_dir, "activate")): + # "activate" virtual env + self.env["PATH"] = bin_dir + ":" + self.env["PATH"] + self.env["VIRTUAL_ENV"] = venv + if "PYTHONHOME" in self.env: + del self.env["PYTHONHOME"] + print(f"Setting venv to {venv}") + else: + sys.exit(f"Venv {venv} not found.") + + def run_sphinx(self, sphinx_build, build_args, *args, **pwargs): + """ + Executes sphinx-build using current python3 command and setting + -j parameter if possible to run the build in parallel. + """ + + with JobserverExec() as jobserver: + if jobserver.claim: + n_jobs = str(jobserver.claim) + else: + n_jobs = "auto" # Supported since Sphinx 1.7 + + cmd = [] + + if self.venv: + cmd.append("python") + else: + cmd.append(sys.executable) + + cmd.append(sphinx_build) + + # if present, SPHINXOPTS or command line --jobs overrides default + if self.n_jobs: + n_jobs = str(self.n_jobs) + + if n_jobs: + cmd += [f"-j{n_jobs}"] + + if not self.verbose: + cmd.append("-q") + + cmd += self.sphinxopts + + cmd += build_args + + if self.verbose: + print(" ".join(cmd)) + + rc = subprocess.call(cmd, *args, **pwargs) + + def handle_html(self, css, output_dir): + """ + Extra steps for HTML and epub output. + + For such targets, we need to ensure that CSS will be properly + copied to the output _static directory + """ + + if not css: + return + + css = os.path.expanduser(css) + if not css.startswith("/"): + css = os.path.join(self.srctree, css) + + static_dir = os.path.join(output_dir, "_static") + os.makedirs(static_dir, exist_ok=True) + + try: + shutil.copy2(css, static_dir) + except (OSError, IOError) as e: + print(f"Warning: Failed to copy CSS: {e}", file=sys.stderr) + + def build_pdf_file(self, latex_cmd, from_dir, path): + """Builds a single pdf file using latex_cmd""" + try: + subprocess.run(latex_cmd + [path], + cwd=from_dir, check=True) + + return True + except subprocess.CalledProcessError: + # LaTeX PDF error code is almost useless: it returns + # error codes even when build succeeds but has warnings. + # So, we'll ignore the results + return False + + def pdf_parallel_build(self, tex_suffix, latex_cmd, tex_files, n_jobs): + """Build PDF files in parallel if possible""" + builds = {} + build_failed = False + max_len = 0 + has_tex = False + + # Process files in parallel + with futures.ThreadPoolExecutor(max_workers=n_jobs) as executor: + jobs = {} + + for from_dir, pdf_dir, entry in tex_files: + name = entry.name + + if not name.endswith(tex_suffix): + continue + + name = name[:-len(tex_suffix)] + + max_len = max(max_len, len(name)) + + has_tex = True + + future = executor.submit(self.build_pdf_file, latex_cmd, + from_dir, entry.path) + jobs[future] = (from_dir, name, entry.path) + + for future in futures.as_completed(jobs): + from_dir, name, path = jobs[future] + + pdf_name = name + ".pdf" + pdf_from = os.path.join(from_dir, pdf_name) + + try: + success = future.result() + + if success and os.path.exists(pdf_from): + pdf_to = os.path.join(pdf_dir, pdf_name) + + os.rename(pdf_from, pdf_to) + builds[name] = os.path.relpath(pdf_to, self.builddir) + else: + builds[name] = "FAILED" + build_failed = True + except Exception as e: + builds[name] = f"FAILED ({str(e)})" + build_failed = True + + # Handle case where no .tex files were found + if not has_tex: + name = "Sphinx LaTeX builder" + max_len = max(max_len, len(name)) + builds[name] = "FAILED (no .tex file was generated)" + build_failed = True + + return builds, build_failed, max_len + + def handle_pdf(self, output_dirs): + """ + Extra steps for PDF output. + + As PDF is handled via a LaTeX output, after building the .tex file, + a new build is needed to create the PDF output from the latex + directory. + """ + builds = {} + max_len = 0 + tex_suffix = ".tex" + + # Get all tex files that will be used for PDF build + tex_files = [] + for from_dir in output_dirs: + pdf_dir = os.path.join(from_dir, "../pdf") + os.makedirs(pdf_dir, exist_ok=True) + + if self.latexmk_cmd: + latex_cmd = [self.latexmk_cmd, f"-{self.pdflatex}"] + else: + latex_cmd = [self.pdflatex] + + latex_cmd.extend(shlex.split(self.latexopts)) + + # Get a list of tex files to process + with os.scandir(from_dir) as it: + for entry in it: + if entry.name.endswith(tex_suffix): + tex_files.append((from_dir, pdf_dir, entry)) + + # When using make, this won't be used, as the number of jobs comes + # from POSIX jobserver. So, this covers the case where build comes + # from command line. On such case, serialize by default, except if + # the user explicitly sets the number of jobs. + n_jobs = 1 + + # n_jobs is either an integer or "auto". Only use it if it is a number + if self.n_jobs: + try: + n_jobs = int(self.n_jobs) + except ValueError: + pass + + # When using make, jobserver.claim is the number of jobs that were + # used with "-j" and that aren't used by other make targets + with JobserverExec() as jobserver: + n_jobs = 1 + + # Handle the case when a parameter is passed via command line, + # using it as default, if jobserver doesn't claim anything + if self.n_jobs: + try: + n_jobs = int(self.n_jobs) + except ValueError: + pass + + if jobserver.claim: + n_jobs = jobserver.claim + + # Build files in parallel + builds, build_failed, max_len = self.pdf_parallel_build(tex_suffix, + latex_cmd, + tex_files, + n_jobs) + + msg = "Summary" + msg += "\n" + "=" * len(msg) + print() + print(msg) + + for pdf_name, pdf_file in builds.items(): + print(f"{pdf_name:<{max_len}}: {pdf_file}") + + print() + + # return an error if a PDF file is missing + + if build_failed: + sys.exit(f"PDF build failed: not all PDF files were created.") + else: + print("All PDF files were built.") + + def handle_info(self, output_dirs): + """ + Extra steps for Info output. + + For texinfo generation, an additional make is needed from the + texinfo directory. + """ + + for output_dir in output_dirs: + try: + subprocess.run(["make", "info"], cwd=output_dir, check=True) + except subprocess.CalledProcessError as e: + sys.exit(f"Error generating info docs: {e}") + + def cleandocs(self, builder): + + shutil.rmtree(self.builddir, ignore_errors=True) + + def build(self, target, sphinxdirs=None, conf="conf.py", + theme=None, css=None, paper=None): + """ + Build documentation using Sphinx. This is the core function of this + module. It prepares all arguments required by sphinx-build. + """ + + builder = TARGETS[target]["builder"] + out_dir = TARGETS[target].get("out_dir", "") + + # Cleandocs doesn't require sphinx-build + if target == "cleandocs": + self.cleandocs(builder) + return + + # Other targets require sphinx-build + sphinxbuild = shutil.which(self.sphinxbuild, path=self.env["PATH"]) + if not sphinxbuild: + sys.exit(f"Error: {self.sphinxbuild} not found in PATH.\n") + + if builder == "latex": + if not self.pdflatex_cmd and not self.latexmk_cmd: + sys.exit("Error: pdflatex or latexmk required for PDF generation") + + docs_dir = os.path.abspath(os.path.join(self.srctree, "Documentation")) + + # Prepare base arguments for Sphinx build + kerneldoc = self.kerneldoc + if kerneldoc.startswith(self.srctree): + kerneldoc = os.path.relpath(kerneldoc, self.srctree) + + # Prepare common Sphinx options + args = [ + "-b", builder, + "-c", docs_dir, + ] + + if builder == "latex": + if not paper: + paper = PAPER[1] + + args.extend(["-D", f"latex_elements.papersize={paper}paper"]) + + if self.config_rust: + args.extend(["-t", "rustdoc"]) + + if conf: + self.env["SPHINX_CONF"] = self.get_path(conf, abs_path=True) + + if not sphinxdirs: + sphinxdirs = os.environ.get("SPHINXDIRS", ".") + + # The sphinx-build tool has a bug: internally, it tries to set + # locale with locale.setlocale(locale.LC_ALL, ''). This causes a + # crash if language is not set. Detect and fix it. + try: + locale.setlocale(locale.LC_ALL, '') + except Exception: + self.env["LC_ALL"] = "C" + self.env["LANG"] = "C" + + # sphinxdirs can be a list or a whitespace-separated string + sphinxdirs_list = [] + for sphinxdir in sphinxdirs: + if isinstance(sphinxdir, list): + sphinxdirs_list += sphinxdir + else: + for name in sphinxdir.split(" "): + sphinxdirs_list.append(name) + + # Build each directory + output_dirs = [] + for sphinxdir in sphinxdirs_list: + src_dir = os.path.join(docs_dir, sphinxdir) + doctree_dir = os.path.join(self.builddir, ".doctrees") + output_dir = os.path.join(self.builddir, sphinxdir, out_dir) + + # Make directory names canonical + src_dir = os.path.normpath(src_dir) + doctree_dir = os.path.normpath(doctree_dir) + output_dir = os.path.normpath(output_dir) + + os.makedirs(doctree_dir, exist_ok=True) + os.makedirs(output_dir, exist_ok=True) + + output_dirs.append(output_dir) + + build_args = args + [ + "-d", doctree_dir, + "-D", f"kerneldoc_bin={kerneldoc}", + "-D", f"version={self.kernelversion}", + "-D", f"release={self.kernelrelease}", + "-D", f"kerneldoc_srctree={self.srctree}", + src_dir, + output_dir, + ] + + # Execute sphinx-build + try: + self.run_sphinx(sphinxbuild, build_args, env=self.env) + except Exception as e: + sys.exit(f"Build failed: {e}") + + # Ensure that html/epub will have needed static files + if target in ["htmldocs", "epubdocs"]: + self.handle_html(css, output_dir) + + # PDF and Info require a second build step + if target == "pdfdocs": + self.handle_pdf(output_dirs) + elif target == "infodocs": + self.handle_info(output_dirs) + + @staticmethod + def get_python_version(cmd): + """ + Get python version from a Python binary. As we need to detect if + are out there newer python binaries, we can't rely on sys.release here. + """ + + result = subprocess.run([cmd, "--version"], check=True, + stdout=subprocess.PIPE, stderr=subprocess.PIPE, + universal_newlines=True) + version = result.stdout.strip() + + match = re.search(r"(\d+\.\d+\.\d+)", version) + if match: + return parse_version(match.group(1)) + + print(f"Can't parse version {version}") + return (0, 0, 0) + + @staticmethod + def find_python(): + """ + Detect if are out there any python 3.xy version newer than the + current one. + + Note: this routine is limited to up to 2 digits for python3. We + may need to update it one day, hopefully on a distant future. + """ + patterns = [ + "python3.[0-9]", + "python3.[0-9][0-9]", + ] + + # Seek for a python binary newer than MIN_PYTHON_VERSION + for path in os.getenv("PATH", "").split(":"): + for pattern in patterns: + for cmd in glob(os.path.join(path, pattern)): + if os.path.isfile(cmd) and os.access(cmd, os.X_OK): + version = SphinxBuilder.get_python_version(cmd) + if version >= MIN_PYTHON_VERSION: + return cmd + + return None + + @staticmethod + def check_python(): + """ + Check if the current python binary satisfies our minimal requirement + for Sphinx build. If not, re-run with a newer version if found. + """ + cur_ver = sys.version_info[:3] + if cur_ver >= MIN_PYTHON_VERSION: + return + + python_ver = ver_str(cur_ver) + + new_python_cmd = SphinxBuilder.find_python() + if not new_python_cmd: + sys.exit(f"Python version {python_ver} is not supported anymore.") + + # Restart script using the newer version + script_path = os.path.abspath(sys.argv[0]) + args = [new_python_cmd, script_path] + sys.argv[1:] + + print(f"Python {python_ver} not supported. Changing to {new_python_cmd}") + + try: + os.execv(new_python_cmd, args) + except OSError as e: + sys.exit(f"Failed to restart with {new_python_cmd}: {e}") + +def jobs_type(value): + """ + Handle valid values for -j. Accepts Sphinx "-jauto", plus a number + equal or bigger than one. + """ + if value is None: + return None + + if value.lower() == 'auto': + return value.lower() + + try: + if int(value) >= 1: + return value + + raise argparse.ArgumentTypeError(f"Minimum jobs is 1, got {value}") + except ValueError: + raise argparse.ArgumentTypeError(f"Must be 'auto' or positive integer, got {value}") + +def main(): + """ + Main function. The only mandatory argument is the target. If not + specified, the other arguments will use default values if not + specified at os.environ. + """ + parser = argparse.ArgumentParser(description="Kernel documentation builder") + + parser.add_argument("target", choices=list(TARGETS.keys()), + help="Documentation target to build") + parser.add_argument("--sphinxdirs", nargs="+", + help="Specific directories to build") + parser.add_argument("--conf", default="conf.py", + help="Sphinx configuration file") + + parser.add_argument("--theme", help="Sphinx theme to use") + + parser.add_argument("--css", help="Custom CSS file for HTML/EPUB") + + parser.add_argument("--paper", choices=PAPER, default=PAPER[0], + help="Paper size for LaTeX/PDF output") + + parser.add_argument("-v", "--verbose", action='store_true', + help="place build in verbose mode") + + parser.add_argument('-j', '--jobs', type=jobs_type, + help="Sets number of jobs to use with sphinx-build") + + parser.add_argument('-i', '--interactive', action='store_true', + help="Change latex default to run in interactive mode") + + parser.add_argument("-V", "--venv", nargs='?', const=f'{VENV_DEFAULT}', + default=None, + help=f'If used, run Sphinx from a venv dir (default dir: {VENV_DEFAULT})') + + args = parser.parse_args() + + SphinxBuilder.check_python() + + builder = SphinxBuilder(venv=args.venv, verbose=args.verbose, + n_jobs=args.jobs, interactive=args.interactive) + + builder.build(args.target, sphinxdirs=args.sphinxdirs, conf=args.conf, + theme=args.theme, css=args.css, paper=args.paper) + +if __name__ == "__main__": + main() diff --git a/tools/docs/lib/__init__.py b/tools/docs/lib/__init__.py new file mode 100644 index 000000000000..e69de29bb2d1 --- /dev/null +++ b/tools/docs/lib/__init__.py diff --git a/tools/docs/lib/enrich_formatter.py b/tools/docs/lib/enrich_formatter.py new file mode 100644 index 000000000000..bb171567a4ca --- /dev/null +++ b/tools/docs/lib/enrich_formatter.py @@ -0,0 +1,70 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2025 by Mauro Carvalho Chehab <mchehab@kernel.org>. + +""" +Ancillary argparse HelpFormatter class that works on a similar way as +argparse.RawDescriptionHelpFormatter, e.g. description maintains line +breaks, but it also implement transformations to the help text. The +actual transformations ar given by enrich_text(), if the output is tty. + +Currently, the follow transformations are done: + + - Positional arguments are shown in upper cases; + - if output is TTY, ``var`` and positional arguments are shown prepended + by an ANSI SGR code. This is usually translated to bold. On some + terminals, like, konsole, this is translated into a colored bold text. +""" + +import argparse +import re +import sys + +class EnrichFormatter(argparse.HelpFormatter): + """ + Better format the output, making easier to identify the positional args + and how they're used at the __doc__ description. + """ + def __init__(self, *args, **kwargs): + """Initialize class and check if is TTY""" + super().__init__(*args, **kwargs) + self._tty = sys.stdout.isatty() + + def enrich_text(self, text): + """Handle ReST markups (currently, only ``foo``)""" + if self._tty and text: + # Replace ``text`` with ANSI SGR (bold) + return re.sub(r'\`\`(.+?)\`\`', + lambda m: f'\033[1m{m.group(1)}\033[0m', text) + return text + + def _fill_text(self, text, width, indent): + """Enrich descriptions with markups on it""" + enriched = self.enrich_text(text) + return "\n".join(indent + line for line in enriched.splitlines()) + + def _format_usage(self, usage, actions, groups, prefix): + """Enrich positional arguments at usage: line""" + + prog = self._prog + parts = [] + + for action in actions: + if action.option_strings: + opt = action.option_strings[0] + if action.nargs != 0: + opt += f" {action.dest.upper()}" + parts.append(f"[{opt}]") + else: + # Positional argument + parts.append(self.enrich_text(f"``{action.dest.upper()}``")) + + usage_text = f"{prefix or 'usage: '} {prog} {' '.join(parts)}\n" + return usage_text + + def _format_action_invocation(self, action): + """Enrich argument names""" + if not action.option_strings: + return self.enrich_text(f"``{action.dest.upper()}``") + + return ", ".join(action.option_strings) diff --git a/tools/docs/lib/parse_data_structs.py b/tools/docs/lib/parse_data_structs.py new file mode 100755 index 000000000000..a5aa2e182052 --- /dev/null +++ b/tools/docs/lib/parse_data_structs.py @@ -0,0 +1,452 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2016-2025 by Mauro Carvalho Chehab <mchehab@kernel.org>. +# pylint: disable=R0912,R0915 + +""" +Parse a source file or header, creating ReStructured Text cross references. + +It accepts an optional file to change the default symbol reference or to +suppress symbols from the output. + +It is capable of identifying defines, functions, structs, typedefs, +enums and enum symbols and create cross-references for all of them. +It is also capable of distinguish #define used for specifying a Linux +ioctl. + +The optional rules file contains a set of rules like: + + ignore ioctl VIDIOC_ENUM_FMT + replace ioctl VIDIOC_DQBUF vidioc_qbuf + replace define V4L2_EVENT_MD_FL_HAVE_FRAME_SEQ :c:type:`v4l2_event_motion_det` +""" + +import os +import re +import sys + + +class ParseDataStructs: + """ + Creates an enriched version of a Kernel header file with cross-links + to each C data structure type. + + It is meant to allow having a more comprehensive documentation, where + uAPI headers will create cross-reference links to the code. + + It is capable of identifying defines, functions, structs, typedefs, + enums and enum symbols and create cross-references for all of them. + It is also capable of distinguish #define used for specifying a Linux + ioctl. + + By default, it create rules for all symbols and defines, but it also + allows parsing an exception file. Such file contains a set of rules + using the syntax below: + + 1. Ignore rules: + + ignore <type> <symbol>` + + Removes the symbol from reference generation. + + 2. Replace rules: + + replace <type> <old_symbol> <new_reference> + + Replaces how old_symbol with a new reference. The new_reference can be: + - A simple symbol name; + - A full Sphinx reference. + + On both cases, <type> can be: + - ioctl: for defines that end with _IO*, e.g. ioctl definitions + - define: for other defines + - symbol: for symbols defined within enums; + - typedef: for typedefs; + - enum: for the name of a non-anonymous enum; + - struct: for structs. + + Examples: + + ignore define __LINUX_MEDIA_H + ignore ioctl VIDIOC_ENUM_FMT + replace ioctl VIDIOC_DQBUF vidioc_qbuf + replace define V4L2_EVENT_MD_FL_HAVE_FRAME_SEQ :c:type:`v4l2_event_motion_det` + """ + + # Parser regexes with multiple ways to capture enums and structs + RE_ENUMS = [ + re.compile(r"^\s*enum\s+([\w_]+)\s*\{"), + re.compile(r"^\s*enum\s+([\w_]+)\s*$"), + re.compile(r"^\s*typedef\s*enum\s+([\w_]+)\s*\{"), + re.compile(r"^\s*typedef\s*enum\s+([\w_]+)\s*$"), + ] + RE_STRUCTS = [ + re.compile(r"^\s*struct\s+([_\w][\w\d_]+)\s*\{"), + re.compile(r"^\s*struct\s+([_\w][\w\d_]+)$"), + re.compile(r"^\s*typedef\s*struct\s+([_\w][\w\d_]+)\s*\{"), + re.compile(r"^\s*typedef\s*struct\s+([_\w][\w\d_]+)$"), + ] + + # FIXME: the original code was written a long time before Sphinx C + # domain to have multiple namespaces. To avoid to much turn at the + # existing hyperlinks, the code kept using "c:type" instead of the + # right types. To change that, we need to change the types not only + # here, but also at the uAPI media documentation. + DEF_SYMBOL_TYPES = { + "ioctl": { + "prefix": "\\ ", + "suffix": "\\ ", + "ref_type": ":ref", + "description": "IOCTL Commands", + }, + "define": { + "prefix": "\\ ", + "suffix": "\\ ", + "ref_type": ":ref", + "description": "Macros and Definitions", + }, + # We're calling each definition inside an enum as "symbol" + "symbol": { + "prefix": "\\ ", + "suffix": "\\ ", + "ref_type": ":ref", + "description": "Enumeration values", + }, + "typedef": { + "prefix": "\\ ", + "suffix": "\\ ", + "ref_type": ":c:type", + "description": "Type Definitions", + }, + # This is the description of the enum itself + "enum": { + "prefix": "\\ ", + "suffix": "\\ ", + "ref_type": ":c:type", + "description": "Enumerations", + }, + "struct": { + "prefix": "\\ ", + "suffix": "\\ ", + "ref_type": ":c:type", + "description": "Structures", + }, + } + + def __init__(self, debug: bool = False): + """Initialize internal vars""" + self.debug = debug + self.data = "" + + self.symbols = {} + + for symbol_type in self.DEF_SYMBOL_TYPES: + self.symbols[symbol_type] = {} + + def store_type(self, symbol_type: str, symbol: str, + ref_name: str = None, replace_underscores: bool = True): + """ + Stores a new symbol at self.symbols under symbol_type. + + By default, underscores are replaced by "-" + """ + defs = self.DEF_SYMBOL_TYPES[symbol_type] + + prefix = defs.get("prefix", "") + suffix = defs.get("suffix", "") + ref_type = defs.get("ref_type") + + # Determine ref_link based on symbol type + if ref_type: + if symbol_type == "enum": + ref_link = f"{ref_type}:`{symbol}`" + else: + if not ref_name: + ref_name = symbol.lower() + + # c-type references don't support hash + if ref_type == ":ref" and replace_underscores: + ref_name = ref_name.replace("_", "-") + + ref_link = f"{ref_type}:`{symbol} <{ref_name}>`" + else: + ref_link = symbol + + self.symbols[symbol_type][symbol] = f"{prefix}{ref_link}{suffix}" + + def store_line(self, line): + """Stores a line at self.data, properly indented""" + line = " " + line.expandtabs() + self.data += line.rstrip(" ") + + def parse_file(self, file_in: str): + """Reads a C source file and get identifiers""" + self.data = "" + is_enum = False + is_comment = False + multiline = "" + + with open(file_in, "r", + encoding="utf-8", errors="backslashreplace") as f: + for line_no, line in enumerate(f): + self.store_line(line) + line = line.strip("\n") + + # Handle continuation lines + if line.endswith(r"\\"): + multiline += line[-1] + continue + + if multiline: + line = multiline + line + multiline = "" + + # Handle comments. They can be multilined + if not is_comment: + if re.search(r"/\*.*", line): + is_comment = True + else: + # Strip C99-style comments + line = re.sub(r"(//.*)", "", line) + + if is_comment: + if re.search(r".*\*/", line): + is_comment = False + else: + multiline = line + continue + + # At this point, line variable may be a multilined statement, + # if lines end with \ or if they have multi-line comments + # With that, it can safely remove the entire comments, + # and there's no need to use re.DOTALL for the logic below + + line = re.sub(r"(/\*.*\*/)", "", line) + if not line.strip(): + continue + + # It can be useful for debug purposes to print the file after + # having comments stripped and multi-lines grouped. + if self.debug > 1: + print(f"line {line_no + 1}: {line}") + + # Now the fun begins: parse each type and store it. + + # We opted for a two parsing logic here due to: + # 1. it makes easier to debug issues not-parsed symbols; + # 2. we want symbol replacement at the entire content, not + # just when the symbol is detected. + + if is_enum: + match = re.match(r"^\s*([_\w][\w\d_]+)\s*[\,=]?", line) + if match: + self.store_type("symbol", match.group(1)) + if "}" in line: + is_enum = False + continue + + match = re.match(r"^\s*#\s*define\s+([\w_]+)\s+_IO", line) + if match: + self.store_type("ioctl", match.group(1), + replace_underscores=False) + continue + + match = re.match(r"^\s*#\s*define\s+([\w_]+)(\s+|$)", line) + if match: + self.store_type("define", match.group(1)) + continue + + match = re.match(r"^\s*typedef\s+([_\w][\w\d_]+)\s+(.*)\s+([_\w][\w\d_]+);", + line) + if match: + name = match.group(2).strip() + symbol = match.group(3) + self.store_type("typedef", symbol, ref_name=name) + continue + + for re_enum in self.RE_ENUMS: + match = re_enum.match(line) + if match: + self.store_type("enum", match.group(1)) + is_enum = True + break + + for re_struct in self.RE_STRUCTS: + match = re_struct.match(line) + if match: + self.store_type("struct", match.group(1)) + break + + def process_exceptions(self, fname: str): + """ + Process exceptions file with rules to ignore or replace references. + """ + if not fname: + return + + name = os.path.basename(fname) + + with open(fname, "r", encoding="utf-8", errors="backslashreplace") as f: + for ln, line in enumerate(f): + ln += 1 + line = line.strip() + if not line or line.startswith("#"): + continue + + # Handle ignore rules + match = re.match(r"^ignore\s+(\w+)\s+(\S+)", line) + if match: + c_type = match.group(1) + symbol = match.group(2) + + if c_type not in self.DEF_SYMBOL_TYPES: + sys.exit(f"{name}:{ln}: {c_type} is invalid") + + d = self.symbols[c_type] + if symbol in d: + del d[symbol] + + continue + + # Handle replace rules + match = re.match(r"^replace\s+(\S+)\s+(\S+)\s+(\S+)", line) + if not match: + sys.exit(f"{name}:{ln}: invalid line: {line}") + + c_type, old, new = match.groups() + + if c_type not in self.DEF_SYMBOL_TYPES: + sys.exit(f"{name}:{ln}: {c_type} is invalid") + + reftype = None + + # Parse reference type when the type is specified + + match = re.match(r"^\:c\:(data|func|macro|type)\:\`(.+)\`", new) + if match: + reftype = f":c:{match.group(1)}" + new = match.group(2) + else: + match = re.search(r"(\:ref)\:\`(.+)\`", new) + if match: + reftype = match.group(1) + new = match.group(2) + + # If the replacement rule doesn't have a type, get default + if not reftype: + reftype = self.DEF_SYMBOL_TYPES[c_type].get("ref_type") + if not reftype: + reftype = self.DEF_SYMBOL_TYPES[c_type].get("real_type") + + new_ref = f"{reftype}:`{old} <{new}>`" + + # Change self.symbols to use the replacement rule + if old in self.symbols[c_type]: + self.symbols[c_type][old] = new_ref + else: + print(f"{name}:{ln}: Warning: can't find {old} {c_type}") + + def debug_print(self): + """ + Print debug information containing the replacement rules per symbol. + To make easier to check, group them per type. + """ + if not self.debug: + return + + for c_type, refs in self.symbols.items(): + if not refs: # Skip empty dictionaries + continue + + print(f"{c_type}:") + + for symbol, ref in sorted(refs.items()): + print(f" {symbol} -> {ref}") + + print() + + def gen_output(self): + """Write the formatted output to a file.""" + + # Avoid extra blank lines + text = re.sub(r"\s+$", "", self.data) + "\n" + text = re.sub(r"\n\s+\n", "\n\n", text) + + # Escape Sphinx special characters + text = re.sub(r"([\_\`\*\<\>\&\\\\:\/\|\%\$\#\{\}\~\^])", r"\\\1", text) + + # Source uAPI files may have special notes. Use bold font for them + text = re.sub(r"DEPRECATED", "**DEPRECATED**", text) + + # Delimiters to catch the entire symbol after escaped + start_delim = r"([ \n\t\(=\*\@])" + end_delim = r"(\s|,|\\=|\\:|\;|\)|\}|\{)" + + # Process all reference types + for ref_dict in self.symbols.values(): + for symbol, replacement in ref_dict.items(): + symbol = re.escape(re.sub(r"([\_\`\*\<\>\&\\\\:\/])", r"\\\1", symbol)) + text = re.sub(fr'{start_delim}{symbol}{end_delim}', + fr'\1{replacement}\2', text) + + # Remove "\ " where not needed: before spaces and at the end of lines + text = re.sub(r"\\ ([\n ])", r"\1", text) + text = re.sub(r" \\ ", " ", text) + + return text + + def gen_toc(self): + """ + Create a TOC table pointing to each symbol from the header + """ + text = [] + + # Add header + text.append(".. contents:: Table of Contents") + text.append(" :depth: 2") + text.append(" :local:") + text.append("") + + # Sort symbol types per description + symbol_descriptions = [] + for k, v in self.DEF_SYMBOL_TYPES.items(): + symbol_descriptions.append((v['description'], k)) + + symbol_descriptions.sort() + + # Process each category + for description, c_type in symbol_descriptions: + + refs = self.symbols[c_type] + if not refs: # Skip empty categories + continue + + text.append(f"{description}") + text.append("-" * len(description)) + text.append("") + + # Sort symbols alphabetically + for symbol, ref in sorted(refs.items()): + text.append(f"* :{ref}:") + + text.append("") # Add empty line between categories + + return "\n".join(text) + + def write_output(self, file_in: str, file_out: str, toc: bool): + title = os.path.basename(file_in) + + if toc: + text = self.gen_toc() + else: + text = self.gen_output() + + with open(file_out, "w", encoding="utf-8", errors="backslashreplace") as f: + f.write(".. -*- coding: utf-8; mode: rst -*-\n\n") + f.write(f"{title}\n") + f.write("=" * len(title) + "\n\n") + + if not toc: + f.write(".. parsed-literal::\n\n") + + f.write(text) diff --git a/tools/docs/parse-headers.py b/tools/docs/parse-headers.py new file mode 100755 index 000000000000..bfa4e46a53e3 --- /dev/null +++ b/tools/docs/parse-headers.py @@ -0,0 +1,60 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2016, 2025 by Mauro Carvalho Chehab <mchehab@kernel.org>. +# pylint: disable=C0103 + +""" +Convert a C header or source file ``FILE_IN``, into a ReStructured Text +included via ..parsed-literal block with cross-references for the +documentation files that describe the API. It accepts an optional +``FILE_RULES`` file to describes what elements will be either ignored or +be pointed to a non-default reference type/name. + +The output is written at ``FILE_OUT``. + +It is capable of identifying defines, functions, structs, typedefs, +enums and enum symbols and create cross-references for all of them. +It is also capable of distinguish #define used for specifying a Linux +ioctl. + +The optional ``FILE_RULES`` contains a set of rules like: + + ignore ioctl VIDIOC_ENUM_FMT + replace ioctl VIDIOC_DQBUF vidioc_qbuf + replace define V4L2_EVENT_MD_FL_HAVE_FRAME_SEQ :c:type:`v4l2_event_motion_det` +""" + +import argparse + +from lib.parse_data_structs import ParseDataStructs +from lib.enrich_formatter import EnrichFormatter + +def main(): + """Main function""" + parser = argparse.ArgumentParser(description=__doc__, + formatter_class=EnrichFormatter) + + parser.add_argument("-d", "--debug", action="count", default=0, + help="Increase debug level. Can be used multiple times") + parser.add_argument("-t", "--toc", action="store_true", + help="instead of a literal block, outputs a TOC table at the RST file") + + parser.add_argument("file_in", help="Input C file") + parser.add_argument("file_out", help="Output RST file") + parser.add_argument("file_rules", nargs="?", + help="Exceptions file (optional)") + + args = parser.parse_args() + + parser = ParseDataStructs(debug=args.debug) + parser.parse_file(args.file_in) + + if args.file_rules: + parser.process_exceptions(args.file_rules) + + parser.debug_print() + parser.write_output(args.file_in, args.file_out, args.toc) + + +if __name__ == "__main__": + main() |
