After all, Markdown carries none of the visual or functional markup that HTML needs and concentrates on the content instead. Furthermore, the many HTML tags can simply be left out of the transmitted response.
Setting this up requires only a small piece of middleware within your site template extension. In the example below, the website directly returns Markdown instead of HTML if the requesting client (browser or crawler) sends an Accept: text/markdown header:
<?php

declare(strict_types=1);

namespace In2code\In2template\Middleware;

use League\HTMLToMarkdown\HtmlConverterInterface;
use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;
use Psr\Http\Message\StreamFactoryInterface;
use Psr\Http\Server\MiddlewareInterface;
use Psr\Http\Server\RequestHandlerInterface;

/**
 * Returns a Markdown representation of the rendered page when the client
 * announces `Accept: text/markdown`. The HTML pipeline runs unchanged; this
 * middleware only post-processes the resulting body.
 */
final class MarkdownContentNegotiation implements MiddlewareInterface
{
    private const string ACCEPT_TOKEN = 'text/markdown';
    private const string CONTENT_WRAPPER_ID = 'content';
    private const int TOKEN_CHAR_RATIO = 4;

    public function __construct(
        private readonly HtmlConverterInterface $htmlConverter,
        private readonly StreamFactoryInterface $streamFactory,
    ) {
    }

    public function process(ServerRequestInterface $request, RequestHandlerInterface $handler): ResponseInterface
    {
        $response = $handler->handle($request);
        if ($this->isMarkdownNegotiated($request) && $this->isHtmlResponse($response)) {
            $response = $this->convertResponseToMarkdown($response);
        }
        return $response->withAddedHeader('Vary', 'Accept');
    }

    private function isMarkdownNegotiated(ServerRequestInterface $request): bool
    {
        return str_contains($request->getHeaderLine('Accept'), self::ACCEPT_TOKEN);
    }

    private function isHtmlResponse(ResponseInterface $response): bool
    {
        return str_contains($response->getHeaderLine('Content-Type'), 'text/html');
    }

    private function convertResponseToMarkdown(ResponseInterface $response): ResponseInterface
    {
        $html = (string)$response->getBody();
        $contentFragment = $this->extractContentFragment($html);
        $markdown = trim($this->htmlConverter->convert($contentFragment));
        return $response
            ->withHeader('Content-Type', 'text/markdown; charset=utf-8')
            ->withHeader('X-Markdown-Tokens', (string)$this->estimateTokenCount($markdown))
            ->withHeader('ETag', '"' . md5($markdown) . '"')
            ->withoutHeader('Content-Length')
            ->withBody($this->streamFactory->createStream($markdown));
    }

    private function extractContentFragment(string $html): string
    {
        $fragment = $html;
        $previousState = libxml_use_internal_errors(true);
        $dom = new \DOMDocument();
        if ($dom->loadHTML('<?xml encoding="UTF-8">' . $html, LIBXML_NOERROR | LIBXML_NOWARNING)) {
            $contentNode = $dom->getElementById(self::CONTENT_WRAPPER_ID);
            if ($contentNode !== null) {
                $fragment = (string)$dom->saveHTML($contentNode);
            }
        }
        libxml_clear_errors();
        libxml_use_internal_errors($previousState);
        return $fragment;
    }

    private function estimateTokenCount(string $markdown): int
    {
        return (int)ceil(mb_strlen($markdown) / self::TOKEN_CHAR_RATIO);
    }
}
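The class alone does nothing until TYPO3 knows about it. A minimal sketch of the registration, assuming the usual Configuration/RequestMiddlewares.php file in the same extension, could look like this. The identifier string is a made-up example and not taken from the original setup:

```php
<?php

declare(strict_types=1);

// Configuration/RequestMiddlewares.php of the site template extension.
// Assumption: the identifier below is a hypothetical name chosen for this sketch.

use In2code\In2template\Middleware\MarkdownContentNegotiation;

return [
    'frontend' => [
        'in2code/in2template/markdown-content-negotiation' => [
            'target' => MarkdownContentNegotiation::class,
            // Optional 'before' / 'after' keys control the position in the
            // middleware stack. The right neighbours depend on your TYPO3
            // version and are deliberately left out here.
        ],
    ],
];
```

Depending on your setup, the dependency injection container may also need to be told which class to use for HtmlConverterInterface, since the constructor asks for the interface rather than a concrete converter.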
For this to work, however, you still need a third-party package that you can easily add via composer.json:
{
  "name": "in2code/cms-boilerplate",
  "description": "in2code GmbH TYPO3 CMS Boilerplate",
  "license": "GPL-2.0",
  "require": {
    "league/html-to-markdown": "^5.1",
    ...
Tip: You can test for yourself whether it actually works using a simple curl command:
curl -ks -H 'Accept: text/markdown' https://local.website.de/
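One case the curl test above does not cover: the str_contains check in isMarkdownNegotiated also matches a header such as Accept: text/markdown;q=0, in which the client explicitly says it does not want Markdown. If that matters for your clients, a stricter check could be sketched as follows. The function name acceptsMarkdown is hypothetical and not part of the middleware shown above, and wildcards such as */* are intentionally not treated as a request for Markdown, so ordinary browsers keep receiving HTML:

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the original middleware): returns true
 * only if the Accept header names text/markdown with a quality above zero.
 */
function acceptsMarkdown(string $acceptHeader): bool
{
    foreach (explode(',', $acceptHeader) as $mediaRange) {
        // Split e.g. "text/markdown;q=0.8" into media type and parameters.
        $parts = array_map('trim', explode(';', $mediaRange));
        $mediaType = strtolower(array_shift($parts));
        if ($mediaType !== 'text/markdown') {
            // Wildcards like */* are skipped on purpose.
            continue;
        }
        $quality = 1.0; // default weight when no q parameter is present
        foreach ($parts as $parameter) {
            if (preg_match('/^q\s*=\s*([0-9.]+)$/i', $parameter, $matches) === 1) {
                $quality = (float)$matches[1];
            }
        }
        return $quality > 0.0;
    }
    return false;
}
```

Inside the middleware, the call would replace the str_contains expression, for example acceptsMarkdown($request->getHeaderLine('Accept')).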




