Much slower than MD4C #50

Open
nuttyartist opened this issue Aug 29, 2023 · 3 comments

@nuttyartist

nuttyartist commented Aug 29, 2023

Hello! Thanks for this library. I was wondering why, for the same text, I got such a difference in performance:

Maddy took 5304 milliseconds
Qt took 5 milliseconds

Maddy code:

std::stringstream markdownInput("some text...");
m_markdownParser->Parse(markdownInput);

Qt code:

QString markdownInput("some text...");
QTextDocument textDoc;
textDoc.setMarkdown(markdownInput);
textDoc.toHtml();

EDIT: By mistake I set it as a feature request.

nuttyartist added the "feature" (Feature Request) label Aug 29, 2023
progsource removed the "feature" (Feature Request) label Aug 29, 2023
@progsource
Owner

When it comes to performance tests, several things play into the results, for example:

  • the operating system
  • other apps and processes currently running on the system, which can slow down a test
  • how many times the tests were run

So currently it is difficult to know the exact reasons for your results.

Besides that, maddy's regex-based approach may well be slowing down Markdown processing at the moment. In version 2 I plan to remove the use of regex and go with another approach, which will hopefully speed maddy up (and which I will, of course, benchmark).
But until then, maddy might not be the fastest solution.

I work on version 2 every now and then, but I cannot commit to a release date yet, due to real life and maddy being a side project.

Of course - if somebody finds a way to speed things up a little in the meantime - I'm always happy for contributions.
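To reduce the influence of the factors listed above (background processes, number of runs), a measurement can be repeated and summarized by its median. The following is only a minimal sketch: `medianMilliseconds` is a hypothetical helper, not part of maddy, Qt, or MD4C, and it reports fractional milliseconds so that very fast parsers do not round down to 0.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (an assumption for illustration): calls `work` `runs`
// times and returns the median wall-clock duration in fractional milliseconds.
template <typename Work>
double medianMilliseconds(Work&& work, int runs)
{
    assert(runs > 0);  // at least one sample is needed

    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(runs));

    for (int i = 0; i < runs; ++i) {
        // steady_clock is monotonic, so it is not affected by system clock changes.
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }

    // Middle sample after sorting (the upper median for even run counts).
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}
```

The workload passed in would be a lambda wrapping the parse call from the original post. Any input object that is consumed by parsing would presumably have to be rebuilt inside the lambda for every run.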

@nuttyartist
Author

nuttyartist commented Sep 1, 2023

Excuse my late reply. Here's a reproducible test with the first chapter of Moby Dick in Markdown: https://gist.github.com/nuttyartist/cb0053ccda823ac98a7ce58f296269cc

I got fairly consistent results, as follows.
In Debug mode:

Maddy took 84380 milliseconds
MD4C took 0 milliseconds

In Release mode:

Maddy took 17552 milliseconds
MD4C took 0 milliseconds

EDIT: I edited the title after realizing Qt is using MD4C underneath.

nuttyartist changed the title from "Much slower than Qt" to "Much slower than MD4C" Sep 1, 2023
@vedderb

vedderb commented Nov 13, 2023

I ran into this performance issue too, and for me it makes maddy almost unusable. After some profiling and testing, I found that the culprits are the following parsers:

EMPHASIZED_PARSER
ITALIC_PARSER
STRIKETHROUGH_PARSER
STRONG_PARSER

What they have in common is a long regular expression that seems to take a long time to evaluate. I don't know whether this breaks anything, but I replaced them with the following loops:

EmphasizedParser

void Parse(std::string& line) override
{
    const std::string pattern = "_";
    const std::string newPattern = "em";
    const std::size_t patlen = pattern.size();

    for (;;) {
        // Find the opening delimiter; stop when none is left.
        const auto pos1 = line.find(pattern);
        if (pos1 == std::string::npos) {
            break;
        }

        // Find the closing delimiter; an unmatched opener is left as-is.
        const auto pos2 = line.find(pattern, pos1 + patlen);
        if (pos2 == std::string::npos) {
            break;
        }

        // Replace "_word_" with "<em>word</em>" in place.
        const std::string word = line.substr(pos1 + patlen, pos2 - pos1 - patlen);
        line.replace(pos1, (patlen + pos2) - pos1, "<" + newPattern + ">" + word + "</" + newPattern + ">");
    }
}

ItalicParser

void Parse(std::string& line) override
{
    const std::string pattern = "*";
    const std::string newPattern = "i";
    const std::size_t patlen = pattern.size();

    for (;;) {
        // Find the opening delimiter; stop when none is left.
        const auto pos1 = line.find(pattern);
        if (pos1 == std::string::npos) {
            break;
        }

        // Find the closing delimiter; an unmatched opener is left as-is.
        const auto pos2 = line.find(pattern, pos1 + patlen);
        if (pos2 == std::string::npos) {
            break;
        }

        // Replace "*word*" with "<i>word</i>" in place.
        const std::string word = line.substr(pos1 + patlen, pos2 - pos1 - patlen);
        line.replace(pos1, (patlen + pos2) - pos1, "<" + newPattern + ">" + word + "</" + newPattern + ">");
    }
}

StrikeThroughParser

void Parse(std::string& line) override
{
    const std::string pattern = "~~";
    const std::string newPattern = "s";
    const std::size_t patlen = pattern.size();

    for (;;) {
        // Find the opening delimiter; stop when none is left.
        const auto pos1 = line.find(pattern);
        if (pos1 == std::string::npos) {
            break;
        }

        // Find the closing delimiter; an unmatched opener is left as-is.
        const auto pos2 = line.find(pattern, pos1 + patlen);
        if (pos2 == std::string::npos) {
            break;
        }

        // Replace "~~word~~" with "<s>word</s>" in place.
        const std::string word = line.substr(pos1 + patlen, pos2 - pos1 - patlen);
        line.replace(pos1, (patlen + pos2) - pos1, "<" + newPattern + ">" + word + "</" + newPattern + ">");
    }
}

StrongParser

void Parse(std::string& line) override
{
    std::string pattern = "**";
    const std::string newPattern = "strong";

    // First pass: "**word**".
    for (;;) {
        const std::size_t patlen = pattern.size();

        const auto pos1 = line.find(pattern);
        if (pos1 == std::string::npos) {
            break;
        }

        const auto pos2 = line.find(pattern, pos1 + patlen);
        if (pos2 == std::string::npos) {
            break;
        }

        const std::string word = line.substr(pos1 + patlen, pos2 - pos1 - patlen);
        line.replace(pos1, (patlen + pos2) - pos1, "<" + newPattern + ">" + word + "</" + newPattern + ">");
    }

    // Second pass: "__word__".
    pattern = "__";

    for (;;) {
        const std::size_t patlen = pattern.size();

        const auto pos1 = line.find(pattern);
        if (pos1 == std::string::npos) {
            break;
        }

        const auto pos2 = line.find(pattern, pos1 + patlen);
        if (pos2 == std::string::npos) {
            break;
        }

        const std::string word = line.substr(pos1 + patlen, pos2 - pos1 - patlen);
        line.replace(pos1, (patlen + pos2) - pos1, "<" + newPattern + ">" + word + "</" + newPattern + ">");
    }
}

I didn't measure how much faster this is, but my application went from being very laggy when parsing Markdown files to having no lag that I can notice at all.

This is just a quick fix and I don't have time at the moment to clean it up and test it more, otherwise I would make a pull request. Just sharing it hoping that it is useful.
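Since the four loops above differ only in the delimiter and the tag name, they could be folded into one shared routine. The following is a rough sketch under that assumption: `replaceDelimited` and `renderInline` are hypothetical names, not maddy functions, and the call order in `renderInline` is a guess rather than maddy's actual parser order. Unlike the loops above, the sketch resumes searching after each replacement instead of rescanning from the start of the line.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of maddy): replaces each pair of `delimiter`
// markers in `line` with <tag>...</tag>. Unmatched delimiters are left alone.
void replaceDelimited(std::string& line, const std::string& delimiter, const std::string& tag)
{
    assert(!delimiter.empty());

    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";
    const std::size_t len = delimiter.size();
    std::size_t searchFrom = 0;

    for (;;) {
        const std::size_t first = line.find(delimiter, searchFrom);
        if (first == std::string::npos) {
            break;
        }
        const std::size_t second = line.find(delimiter, first + len);
        if (second == std::string::npos) {
            break;
        }

        const std::string replacement =
            open + line.substr(first + len, second - first - len) + close;
        line.replace(first, second + len - first, replacement);

        // Continue after the text just inserted.
        searchFrom = first + replacement.size();
    }
}

// Hypothetical driver: two-character delimiters run before one-character
// ones, otherwise the "*" pass would consume the stars of "**bold**".
std::string renderInline(std::string line)
{
    replaceDelimited(line, "**", "strong");
    replaceDelimited(line, "__", "strong");
    replaceDelimited(line, "~~", "s");
    replaceDelimited(line, "*", "i");
    replaceDelimited(line, "_", "em");
    return line;
}
```

One thing such plain delimiter matching does break: it ignores word boundaries, so `renderInline("foo_bar_baz")` yields `foo<em>bar</em>baz`, whereas CommonMark leaves intraword underscores untouched. Identifiers and URLs containing underscores would need extra handling.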
