Skip to content

[clangd] Add InsertReplaceEdit for code completion#187623

Merged
timon-ul merged 8 commits into
llvm:mainfrom
argothiel:argothiel/insertReplace
May 10, 2026
Merged

[clangd] Add InsertReplaceEdit for code completion#187623
timon-ul merged 8 commits into
llvm:mainfrom
argothiel:argothiel/insertReplace

Conversation

@argothiel

@argothiel argothiel commented Mar 20, 2026

Copy link
Copy Markdown
Contributor

Handle new insertReplaceSupport capability (defined in LSP 3.16). Add the new option to the protocol layer and pass it around to the code completion logic. Update CompletionItem::textEdit to become the union type as per the LSP specification.

Add a new helper function to the Lexer public API to find the end of an identifier with full context lexing, to avoid duplicating the logic. Use the helper both in the Sema flow and in the comment completion flow. Use a simpler ASCII-only scan in no-Sema mode.

Add LIT tests to verify auto-triggered completions, mid-word replacement, Unicode, and snippets. Add unit tests to verify insert/replace ranges with and without Sema, including comments and the feature-off case.

Update the release notes to document the new capability.

Fixes clangd/clangd#2190

When insertReplaceSupport from LSP 3.16 is implemented, a completion
range could be either an insert range or a replace range. This commit
renames the existing completion ranges as insert ranges, so that the
replace ranges can be added in a non-confusing way.

This commit only makes sense with the follow-up implementation of the
insertReplaceSupport.

Included renames:
- CodeCompleteFlow::ReplacedRange -> InsertRange
- CodeCompleteResult::CompletionRange -> InsertRange
- CodeCompletion::CompletionTokenRange -> CompletionInsertRange
- codeCompleteComment: Offset -> OutsideStartOffset
- codeCompleteComment: CompletionRange -> InsertRange
Handle new insertReplaceSupport capability (LSP 3.16). Add
InsertReplaceEdit to the protocol layer and pass it around to the code
completion logic. Update CompletionItem::textEdit to become the union
type as per the LSP specification.

Add a new helper function to the Lexer to find the end of the identifier
with full context lexing, to avoid duplicating the logic. Use the helper
both in Sema flow, and in the comment completion flow. Use a simpler
ASCII-only scan in no-compile mode.

Add LIT tests to verify auto-triggered completions, mid-word
replacement, unicode, and snippets. Add unit tests to verify
insert/replace ranges with and without Sema, including comments and the
feature-off case.

Update the release notes to document the new capability.

Fixes clangd/clangd#2190
@github-actions

Copy link
Copy Markdown

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@argothiel

argothiel commented Mar 20, 2026

Copy link
Copy Markdown
Contributor Author

The pull request is split into two commits, to separate the name changes from the functional logic, for easier reviewing only. The name changes don't make sense on their own, that's why they're not in a separate PR. The intent is to squash both commits before merging.

const SemaCompleteInput &Input,
IncludeStructure *Includes = nullptr) {
IncludeStructure *Includes = nullptr,
std::unique_ptr<CompilerInvocation> CI = nullptr) {

@argothiel argothiel Mar 20, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the codeCompleteComment path, LangOpts are required before this function call. Thus, adding here the possiblity to call semaCodeComplete with an already calculated CI, to avoid double work.

const auto &CodeCompletionRange = CharSourceRange::getCharRange(
Recorder->CCSema->getPreprocessor().getCodeCompletionTokenRange());

const SourceManager &SM = Recorder->CCSema->getSourceManager();

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracting Recorder->CCSema->getSourceManager() as it was already used twice in the code, and I didn't want to copy it for the third time for my change.

SM, Lexer::findEndOfIdentifierContinuation(SuffixBegin, SM, LangOpts));
// Adjust for the NUL byte inserted at the cursor by code completion,
// which inflates the column by 1.
End.character--;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels a bit hacky. If anyone has an idea how not to go off by one here, I'm open for suggestions.

// If Offset is inside what looks like argument comment (e.g.
// "/*^*/" or "/* foo = ^*/"), returns the offset pointing past the closing "*/".
static std::optional<unsigned>
maybeFunctionArgumentCommentEnd(const PathRef FileName, const unsigned Offset,

@argothiel argothiel Mar 20, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not consistent with finding the beginning of the comment (which is ASCII-only), but it's definitely more correct, and we always have compiler invocation context here, so why not use it.

Finding the beginning of the comment can be fixed to use more comprehensive lexing in the future, but it's outside the scope of this change. As searching forward and backward are pretty different actions, it will be unrelated code anyway.

break;
}
}
if (auto IRSupport = Item->getBoolean("insertReplaceSupport"))

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be const, but const is too much underused in the entire function, so I went for consistency with other fields instead.

@argothiel argothiel marked this pull request as ready for review March 20, 2026 02:49
@llvmbot llvmbot added clang-tools-extra clangd clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Mar 20, 2026
@llvmbot

llvmbot commented Mar 20, 2026

Copy link
Copy Markdown
Member

@llvm/pr-subscribers-clangd

Author: None (argothiel)

Changes

Handle new insertReplaceSupport capability (LSP 3.16). Add the new option to the protocol layer and pass it around to the code completion logic. Update CompletionItem::textEdit to become the union type as per the LSP specification.

Add a new helper function to the Lexer to find the end of the identifier with full context lexing, to avoid duplicating the logic. Use the helper both in Sema flow, and in the comment completion flow. Use a simpler ASCII-only scan in no-compile mode.

Add LIT tests to verify auto-triggered completions, mid-word replacement, unicode, and snippets. Add unit tests to verify insert/replace ranges with and without Sema, including comments and the feature-off case.

Update the release notes to document the new capability.

Fixes clangd/clangd#2190


Patch is 48.84 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/187623.diff

13 Files Affected:

  • (modified) clang-tools-extra/clangd/ClangdLSPServer.cpp (+1)
  • (modified) clang-tools-extra/clangd/CodeComplete.cpp (+147-36)
  • (modified) clang-tools-extra/clangd/CodeComplete.h (+13-3)
  • (modified) clang-tools-extra/clangd/Protocol.cpp (+12-1)
  • (modified) clang-tools-extra/clangd/Protocol.h (+24-3)
  • (added) clang-tools-extra/clangd/test/completion-auto-trigger-replace.test (+113)
  • (added) clang-tools-extra/clangd/test/completion-replace.test (+166)
  • (added) clang-tools-extra/clangd/test/completion-snippets-replace.test (+68)
  • (modified) clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp (+139-15)
  • (modified) clang-tools-extra/docs/ReleaseNotes.rst (+4)
  • (modified) clang/include/clang/Lex/Lexer.h (+8)
  • (modified) clang/lib/Lex/Lexer.cpp (+22)
  • (modified) clang/unittests/Lex/LexerTest.cpp (+34)
diff --git a/clang-tools-extra/clangd/ClangdLSPServer.cpp b/clang-tools-extra/clangd/ClangdLSPServer.cpp
index ebd42abd2dd61..65da685be1278 100644
--- a/clang-tools-extra/clangd/ClangdLSPServer.cpp
+++ b/clang-tools-extra/clangd/ClangdLSPServer.cpp
@@ -518,6 +518,7 @@ void ClangdLSPServer::onInitialize(const InitializeParams &Params,
 
   Opts.CodeComplete.EnableSnippets = Params.capabilities.CompletionSnippets;
   Opts.CodeComplete.IncludeFixIts = Params.capabilities.CompletionFixes;
+  Opts.CodeComplete.EnableInsertReplace = Params.capabilities.InsertReplace;
   if (!Opts.CodeComplete.BundleOverloads)
     Opts.CodeComplete.BundleOverloads = Params.capabilities.HasSignatureHelp;
   Opts.CodeComplete.DocumentationFormat =
diff --git a/clang-tools-extra/clangd/CodeComplete.cpp b/clang-tools-extra/clangd/CodeComplete.cpp
index 7c390f9c8219d..c58ef139b7437 100644
--- a/clang-tools-extra/clangd/CodeComplete.cpp
+++ b/clang-tools-extra/clangd/CodeComplete.cpp
@@ -1382,14 +1382,17 @@ void loadMainFilePreambleMacros(const Preprocessor &PP,
 bool semaCodeComplete(std::unique_ptr<CodeCompleteConsumer> Consumer,
                       const clang::CodeCompleteOptions &Options,
                       const SemaCompleteInput &Input,
-                      IncludeStructure *Includes = nullptr) {
+                      IncludeStructure *Includes = nullptr,
+                      std::unique_ptr<CompilerInvocation> CI = nullptr) {
   trace::Span Tracer("Sema completion");
 
   IgnoreDiagnostics IgnoreDiags;
-  auto CI = buildCompilerInvocation(Input.ParseInput, IgnoreDiags);
   if (!CI) {
-    elog("Couldn't create CompilerInvocation");
-    return false;
+    CI = buildCompilerInvocation(Input.ParseInput, IgnoreDiags);
+    if (!CI) {
+      elog("Couldn't create CompilerInvocation");
+      return false;
+    }
   }
   auto &FrontendOpts = CI->getFrontendOpts();
   FrontendOpts.SkipFunctionBodies = true;
@@ -1615,7 +1618,8 @@ class CodeCompleteFlow {
   bool Incomplete = false; // Would more be available with a higher limit?
   CompletionPrefix HeuristicPrefix;
   std::optional<FuzzyMatcher> Filter; // Initialized once Sema runs.
-  Range ReplacedRange;
+  Range InsertRange;
+  std::optional<Range> ReplaceRange;
   std::vector<std::string> QueryScopes;      // Initialized once Sema runs.
   std::vector<std::string> AccessibleScopes; // Initialized once Sema runs.
   // Initialized once QueryScopes is initialized, if there are scopes.
@@ -1750,8 +1754,19 @@ class CodeCompleteFlow {
     IsUsingDeclaration = false;
     Filter = FuzzyMatcher(HeuristicPrefix.Name);
     auto Pos = offsetToPosition(Content, Offset);
-    ReplacedRange.start = ReplacedRange.end = Pos;
-    ReplacedRange.start.character -= HeuristicPrefix.Name.size();
+    InsertRange.start = InsertRange.end = Pos;
+    InsertRange.start.character -= HeuristicPrefix.Name.size();
+
+    if (Opts.EnableInsertReplace) {
+      ReplaceRange.emplace();
+      ReplaceRange->start = InsertRange.start;
+      // Scan forward past ASCII identifier characters to find replace end.
+      size_t ReplaceEnd = Offset;
+      while (ReplaceEnd < Content.size() &&
+             isAsciiIdentifierContinue(Content[ReplaceEnd]))
+        ++ReplaceEnd;
+      ReplaceRange->end = offsetToPosition(Content, ReplaceEnd);
+    }
 
     llvm::StringMap<SourceParams> ProxSources;
     ProxSources[FileName].Cost = 0;
@@ -1832,19 +1847,26 @@ class CodeCompleteFlow {
   CodeCompleteResult runWithSema() {
     const auto &CodeCompletionRange = CharSourceRange::getCharRange(
         Recorder->CCSema->getPreprocessor().getCodeCompletionTokenRange());
+
+    const SourceManager &SM = Recorder->CCSema->getSourceManager();
+
     // When we are getting completions with an empty identifier, for example
     //    std::vector<int> asdf;
     //    asdf.^;
     // Then the range will be invalid and we will be doing insertion, use
     // current cursor position in such cases as range.
     if (CodeCompletionRange.isValid()) {
-      ReplacedRange = halfOpenToRange(Recorder->CCSema->getSourceManager(),
-                                      CodeCompletionRange);
+      InsertRange = halfOpenToRange(SM, CodeCompletionRange);
     } else {
       const auto &Pos = sourceLocToPosition(
-          Recorder->CCSema->getSourceManager(),
-          Recorder->CCSema->getPreprocessor().getCodeCompletionLoc());
-      ReplacedRange.start = ReplacedRange.end = Pos;
+          SM, Recorder->CCSema->getPreprocessor().getCodeCompletionLoc());
+      InsertRange.start = InsertRange.end = Pos;
+    }
+
+    if (Opts.EnableInsertReplace) {
+      ReplaceRange.emplace();
+      ReplaceRange->start = InsertRange.start;
+      ReplaceRange->end = getEndOfCodeCompletionReplace(SM);
     }
     Filter = FuzzyMatcher(
         Recorder->CCSema->getPreprocessor().getCodeCompletionFilter());
@@ -1874,6 +1896,26 @@ class CodeCompleteFlow {
     return toCodeCompleteResult(Top);
   }
 
+  // Returns the LSP position at the end of the identifier suffix after the
+  // code completion cursor.
+  Position getEndOfCodeCompletionReplace(const SourceManager &SM) {
+    const Preprocessor &PP = Recorder->CCSema->getPreprocessor();
+    const LangOptions &LangOpts = Recorder->CCSema->getLangOpts();
+
+    // Skip past the code completion NUL byte and scan forward through
+    // identifier continuation characters (letters, digits, _, $, UCN,
+    // unicode). This handles all cases uniformly: with prefix ("vac^1abc"),
+    // without prefix ("vec.^asdf"), and digit-starting ("vec.^1abc").
+    const SourceLocation SuffixBegin =
+        PP.getCodeCompletionLoc().getLocWithOffset(1);
+    Position End = sourceLocToPosition(
+        SM, Lexer::findEndOfIdentifierContinuation(SuffixBegin, SM, LangOpts));
+    // Adjust for the NUL byte inserted at the cursor by code completion,
+    // which inflates the column by 1.
+    End.character--;
+    return End;
+  }
+
   CodeCompleteResult
   toCodeCompleteResult(const std::vector<ScoredBundle> &Scored) {
     CodeCompleteResult Output;
@@ -1885,7 +1927,8 @@ class CodeCompleteFlow {
     for (auto &C : Scored) {
       Output.Completions.push_back(toCodeCompletion(C.first));
       Output.Completions.back().Score = C.second;
-      Output.Completions.back().CompletionTokenRange = ReplacedRange;
+      Output.Completions.back().CompletionInsertRange = InsertRange;
+      Output.Completions.back().CompletionReplaceRange = ReplaceRange;
       if (Opts.Index && !Output.Completions.back().Documentation) {
         for (auto &Cand : C.first) {
           if (Cand.SemaResult &&
@@ -1909,7 +1952,8 @@ class CodeCompleteFlow {
     }
     Output.HasMore = Incomplete;
     Output.Context = CCContextKind;
-    Output.CompletionRange = ReplacedRange;
+    Output.InsertRange = InsertRange;
+    Output.ReplaceRange = ReplaceRange;
 
     // Look up documentation from the index.
     if (Opts.Index) {
@@ -2230,16 +2274,54 @@ CompletionPrefix guessCompletionPrefix(llvm::StringRef Content,
   return Result;
 }
 
+// If Offset is inside what looks like argument comment (e.g.
+// "/*^*/" or "/* foo = ^*/"), returns the offset pointing past the closing "*/".
+static std::optional<unsigned>
+maybeFunctionArgumentCommentEnd(const PathRef FileName, const unsigned Offset,
+                                const llvm::StringRef Content,
+                                const LangOptions &LangOpts) {
+  if (Offset > Content.size())
+    return std::nullopt;
+
+  SourceManagerForFile FileSM(FileName, Content);
+  const SourceManager &SM = FileSM.get();
+  const SourceLocation Cursor =
+      SM.getComposedLoc(SM.getMainFileID(), static_cast<unsigned>(Offset));
+  const SourceLocation EndOfSuffix =
+      Lexer::findEndOfIdentifierContinuation(Cursor, SM, LangOpts);
+  const unsigned EndOfSuffixOffset = SM.getFileOffset(EndOfSuffix);
+
+  const llvm::StringRef Rest = Content.drop_front(EndOfSuffixOffset);
+  llvm::StringRef RestTrimmed = Rest.ltrim();
+  // Comment argument pattern: `/* name = */` — skip past optional `=`.
+  if (RestTrimmed.starts_with("="))
+    RestTrimmed = RestTrimmed.drop_front(1).ltrim();
+  if (RestTrimmed.starts_with("*/"))
+    return EndOfSuffixOffset + (Rest.size() - RestTrimmed.size()) + 2;
+  return std::nullopt;
+}
+
 // Code complete the argument name on "/*" inside function call.
-// Offset should be pointing to the start of the comment, i.e.:
+// OutsideStartOffset should be pointing before the comment, i.e.:
 // foo(^/*, rather than foo(/*^) where the cursor probably is.
-CodeCompleteResult codeCompleteComment(PathRef FileName, unsigned Offset,
-                                       llvm::StringRef Prefix,
-                                       const PreambleData *Preamble,
-                                       const ParseInputs &ParseInput) {
+CodeCompleteResult
+codeCompleteComment(PathRef FileName, const unsigned CursorOffset,
+                    unsigned OutsideStartOffset, llvm::StringRef Prefix,
+                    const PreambleData *Preamble, const ParseInputs &ParseInput,
+                    const CodeCompleteOptions &Opts) {
   if (Preamble == nullptr) // Can't run without Sema.
     return CodeCompleteResult();
 
+  IgnoreDiagnostics IgnoreDiags;
+  auto CI = buildCompilerInvocation(ParseInput, IgnoreDiags);
+  if (!CI)
+    return CodeCompleteResult();
+
+  std::optional<unsigned> OutsideEndOffset;
+  if (Opts.EnableInsertReplace)
+    OutsideEndOffset = maybeFunctionArgumentCommentEnd(
+        FileName, CursorOffset, ParseInput.Contents, CI->getLangOpts());
+
   clang::CodeCompleteOptions Options;
   Options.IncludeGlobals = false;
   Options.IncludeMacros = false;
@@ -2250,20 +2332,31 @@ CodeCompleteResult codeCompleteComment(PathRef FileName, unsigned Offset,
   // full patch.
   semaCodeComplete(
       std::make_unique<ParamNameCollector>(Options, ParamNames), Options,
-      {FileName, Offset, *Preamble,
+      {FileName, OutsideStartOffset, *Preamble,
        PreamblePatch::createFullPatch(FileName, ParseInput, *Preamble),
-       ParseInput});
+       ParseInput},
+      /*Includes=*/nullptr, std::move(CI));
   if (ParamNames.empty())
     return CodeCompleteResult();
 
   CodeCompleteResult Result;
-  Range CompletionRange;
+  Range InsertRange;
   // Skip /*
-  Offset += 2;
-  CompletionRange.start = offsetToPosition(ParseInput.Contents, Offset);
-  CompletionRange.end =
-      offsetToPosition(ParseInput.Contents, Offset + Prefix.size());
-  Result.CompletionRange = CompletionRange;
+  const unsigned InsideStartOffset = OutsideStartOffset + 2;
+  InsertRange.start = offsetToPosition(ParseInput.Contents, InsideStartOffset);
+  InsertRange.end =
+      offsetToPosition(ParseInput.Contents, InsideStartOffset + Prefix.size());
+  Result.InsertRange = InsertRange;
+
+  if (Opts.EnableInsertReplace) {
+    Range ReplaceRange;
+    ReplaceRange.start = InsertRange.start;
+    ReplaceRange.end = OutsideEndOffset ? offsetToPosition(ParseInput.Contents,
+                                                           *OutsideEndOffset)
+                                        : InsertRange.end;
+    Result.ReplaceRange = ReplaceRange;
+  }
+
   Result.Context = CodeCompletionContext::CCC_NaturalLanguage;
   for (llvm::StringRef Name : ParamNames) {
     if (!Name.starts_with(Prefix))
@@ -2272,7 +2365,8 @@ CodeCompleteResult codeCompleteComment(PathRef FileName, unsigned Offset,
     Item.Name = Name.str() + "=*/";
     Item.FilterText = Item.Name;
     Item.Kind = CompletionItemKind::Text;
-    Item.CompletionTokenRange = CompletionRange;
+    Item.CompletionInsertRange = InsertRange;
+    Item.CompletionReplaceRange = Result.ReplaceRange;
     Item.Origin = SymbolOrigin::AST;
     Result.Completions.push_back(Item);
   }
@@ -2312,8 +2406,8 @@ CodeCompleteResult codeComplete(PathRef FileName, Position Pos,
     // parsing, so we must move back the position before running it, extract
     // information we need and construct completion items ourselves.
     auto CommentPrefix = Content.substr(*OffsetBeforeComment + 2).trim();
-    return codeCompleteComment(FileName, *OffsetBeforeComment, CommentPrefix,
-                               Preamble, ParseInput);
+    return codeCompleteComment(FileName, *Offset, *OffsetBeforeComment,
+                               CommentPrefix, Preamble, ParseInput, Opts);
   }
 
   auto Flow = CodeCompleteFlow(
@@ -2423,7 +2517,9 @@ CompletionItem CodeCompletion::render(const CodeCompleteOptions &Opts) const {
   }
   LSP.sortText = sortText(Score.Total, FilterText);
   LSP.filterText = FilterText;
-  LSP.textEdit = {CompletionTokenRange, RequiredQualifier + Name, ""};
+  TextEdit Edit;
+  Edit.range = CompletionInsertRange;
+  Edit.newText = RequiredQualifier + Name;
   // Merge continuous additionalTextEdits into main edit. The main motivation
   // behind this is to help LSP clients, it seems most of them are confused when
   // they are provided with additionalTextEdits that are consecutive to main
@@ -2432,19 +2528,34 @@ CompletionItem CodeCompletion::render(const CodeCompleteOptions &Opts) const {
   // is mainly to help LSP clients again, so that changes do not effect each
   // other.
   for (const auto &FixIt : FixIts) {
-    if (FixIt.range.end == LSP.textEdit->range.start) {
-      LSP.textEdit->newText = FixIt.newText + LSP.textEdit->newText;
-      LSP.textEdit->range.start = FixIt.range.start;
+    if (FixIt.range.end == Edit.range.start) {
+      Edit.newText = FixIt.newText + Edit.newText;
+      Edit.range.start = FixIt.range.start;
     } else {
       LSP.additionalTextEdits.push_back(FixIt);
     }
   }
   if (Opts.EnableSnippets)
-    LSP.textEdit->newText += SnippetSuffix;
+    Edit.newText += SnippetSuffix;
 
   // FIXME(kadircet): Do not even fill insertText after making sure textEdit is
   // compatible with most of the editors.
-  LSP.insertText = LSP.textEdit->newText;
+  LSP.insertText = Edit.newText;
+  if (Opts.EnableInsertReplace) {
+    assert(CompletionReplaceRange &&
+           "CompletionReplaceRange must be already set before render() "
+           "when EnableInsertReplace is on");
+    InsertReplaceEdit IRE;
+    IRE.newText = std::move(Edit.newText);
+    IRE.insert = Edit.range;
+    IRE.replace = *CompletionReplaceRange;
+    // FixIt merging may have extended the insert range start; keep replace
+    // range as a superset per LSP spec.
+    IRE.replace.start = IRE.insert.start;
+    LSP.textEdit = std::move(IRE);
+  } else {
+    LSP.textEdit = std::move(Edit);
+  }
   // Some clients support snippets but work better with plaintext.
   // So if the snippet is trivial, let the client know.
   // https://github.com/clangd/clangd/issues/922
diff --git a/clang-tools-extra/clangd/CodeComplete.h b/clang-tools-extra/clangd/CodeComplete.h
index cde22a8212e6a..99ae48c30907c 100644
--- a/clang-tools-extra/clangd/CodeComplete.h
+++ b/clang-tools-extra/clangd/CodeComplete.h
@@ -71,6 +71,10 @@ struct CodeCompleteOptions {
   /// Whether to present doc comments as plain-text or markdown.
   MarkupKind DocumentationFormat = MarkupKind::PlainText;
 
+  /// Whether to present the completion as a single textEdit range or as two
+  /// ranges (insert/replace).
+  bool EnableInsertReplace = false;
+
   Config::HeaderInsertionPolicy InsertIncludes =
       Config::HeaderInsertionPolicy::IWYU;
 
@@ -222,7 +226,9 @@ struct CodeCompletion {
   std::vector<TextEdit> FixIts;
 
   /// Holds the range of the token we are going to replace with this completion.
-  Range CompletionTokenRange;
+  Range CompletionInsertRange;
+  /// If set, the range to use when the client's insert mode is "replace".
+  std::optional<Range> CompletionReplaceRange;
 
   // Scores are used to rank completion items.
   struct Scores {
@@ -261,8 +267,12 @@ struct CodeCompleteResult {
   // The text that is being directly completed.
   // Example: foo.pb^ -> foo.push_back()
   //              ~~
-  // Typically matches the textEdit.range of Completions, but not guaranteed to.
-  std::optional<Range> CompletionRange;
+  // Typically matches the textEdit.range (or textEdit.insert range) of
+  // Completions, but not guaranteed to.
+  std::optional<Range> InsertRange;
+  // If not empty, typically matches the textEdit.replace range of Completions,
+  // but not guaranteed to.
+  std::optional<Range> ReplaceRange;
   // Usually the source will be parsed with a real C++ parser.
   // But heuristics may be used instead if e.g. the preamble is not ready.
   bool RanParser = true;
diff --git a/clang-tools-extra/clangd/Protocol.cpp b/clang-tools-extra/clangd/Protocol.cpp
index 793db7b052990..f77b0773d445a 100644
--- a/clang-tools-extra/clangd/Protocol.cpp
+++ b/clang-tools-extra/clangd/Protocol.cpp
@@ -202,6 +202,14 @@ llvm::json::Value toJSON(const TextEdit &P) {
   return Result;
 }
 
+llvm::json::Value toJSON(const InsertReplaceEdit &P) {
+  return llvm::json::Object{
+      {"newText", P.newText},
+      {"insert", P.insert},
+      {"replace", P.replace},
+  };
+}
+
 bool fromJSON(const llvm::json::Value &Params, ChangeAnnotation &R,
               llvm::json::Path P) {
   llvm::json::ObjectMapper O(Params, P);
@@ -414,6 +422,8 @@ bool fromJSON(const llvm::json::Value &Params, ClientCapabilities &R,
               break;
           }
         }
+        if (auto IRSupport = Item->getBoolean("insertReplaceSupport"))
+          R.InsertReplace = *IRSupport;
       }
       if (auto *ItemKind = Completion->getObject("completionItemKind")) {
         if (auto *ValueSet = ItemKind->get("valueSet")) {
@@ -1184,7 +1194,8 @@ llvm::json::Value toJSON(const CompletionItem &CI) {
   if (CI.insertTextFormat != InsertTextFormat::Missing)
     Result["insertTextFormat"] = static_cast<int>(CI.insertTextFormat);
   if (CI.textEdit)
-    Result["textEdit"] = *CI.textEdit;
+    Result["textEdit"] = std::visit(
+        [](const auto &V) { return llvm::json::Value(V); }, *CI.textEdit);
   if (!CI.additionalTextEdits.empty())
     Result["additionalTextEdits"] = llvm::json::Array(CI.additionalTextEdits);
   if (CI.deprecated)
diff --git a/clang-tools-extra/clangd/Protocol.h b/clang-tools-extra/clangd/Protocol.h
index 7a99721a1e856..9c1bb9d9bb059 100644
--- a/clang-tools-extra/clangd/Protocol.h
+++ b/clang-tools-extra/clangd/Protocol.h
@@ -34,6 +34,7 @@
 #include <memory>
 #include <optional>
 #include <string>
+#include <variant>
 #include <vector>
 
 // This file is using the LSP syntax for identifier names which is different
@@ -261,6 +262,18 @@ bool fromJSON(const llvm::json::Value &, TextEdit &, llvm::json::Path);
 llvm::json::Value toJSON(const TextEdit &);
 llvm::raw_ostream &operator<<(llvm::raw_ostream &, const TextEdit &);
 
+struct InsertReplaceEdit {
+  /// The string to be inserted.
+  std::string newText;
+
+  /// The range if the insert is requested.
+  Range insert;
+
+  /// The range if the replace is requested.
+  Range replace;
+};
+llvm::json::Value toJSON(const InsertReplaceEdit &);
+
 struct ChangeAnnotation {
   /// A human-readable string describing the actual change. The string
   /// is rendered prominent in the user interface.
@@ -510,6 +523,10 @@ struct ClientCapabilities {
   /// textDocument.completion.completionItem.documentationFormat
   MarkupKind CompletionDocumentationFormat = MarkupKind::PlainText;
 
+  /// Client supports insert replace edit to control different behavior if a
+  /// completion item is inserted in the text or should replace text.
+  bool InsertReplace = false;
+
   /// The client has support for completion item label details.
   /// textDocument.completion.completionItem.labelDetailsSupport.
   bool CompletionLabelDetail = false;
@@ -1372,9 +1389,13 @@ struct CompletionItem {
   /// An edit which is applied to a document when selecting this completion.
   /// When an edit is provided `insertText` is ignored.
   ///
-  /// Note: The range of the edit must be a single line range and it must
-  /// contain the position at which completion has been requested.
-  std::optional<TextEdit> textEdit;
+  /// Note 1: The text edit's range as well as both ranges from an insert
+  /// replace edit must be a single line range and must contain the position
+  /// at which completion has been requested.
+  /// Note 2: If an `InsertReplaceEdit` is returned, the edit's insert range
+  /// must be a prefix of the edit's replace range, meaning it must be
+  /// contained in and starting at the same position.
+  std:...
[truncated]

@llvmbot

llvmbot commented Mar 20, 2026

Copy link
Copy Markdown
Member

@llvm/pr-subscribers-clang-tools-extra

Author: None (argothiel)

Changes

Handle new insertReplaceSupport capability (LSP 3.16). Add the new option to the protocol layer and pass it around to the code completion logic. Update CompletionItem::textEdit to become the union type as per the LSP specification.

Add a new helper function to the Lexer to find the end of the identifier with full context lexing, to avoid duplicating the logic. Use the helper both in Sema flow, and in the comment completion flow. Use a simpler ASCII-only scan in no-compile mode.

Add LIT tests to verify auto-triggered completions, mid-word replacement, unicode, and snippets. Add unit tests to verify insert/replace ranges with and without Sema, including comments and the feature-off case.

Update the release notes to document the new capability.

Fixes clangd/clangd#2190


Patch is 48.84 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/187623.diff

13 Files Affected:

  • (modified) clang-tools-extra/clangd/ClangdLSPServer.cpp (+1)
  • (modified) clang-tools-extra/clangd/CodeComplete.cpp (+147-36)
  • (modified) clang-tools-extra/clangd/CodeComplete.h (+13-3)
  • (modified) clang-tools-extra/clangd/Protocol.cpp (+12-1)
  • (modified) clang-tools-extra/clangd/Protocol.h (+24-3)
  • (added) clang-tools-extra/clangd/test/completion-auto-trigger-replace.test (+113)
  • (added) clang-tools-extra/clangd/test/completion-replace.test (+166)
  • (added) clang-tools-extra/clangd/test/completion-snippets-replace.test (+68)
  • (modified) clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp (+139-15)
  • (modified) clang-tools-extra/docs/ReleaseNotes.rst (+4)
  • (modified) clang/include/clang/Lex/Lexer.h (+8)
  • (modified) clang/lib/Lex/Lexer.cpp (+22)
  • (modified) clang/unittests/Lex/LexerTest.cpp (+34)
diff --git a/clang-tools-extra/clangd/ClangdLSPServer.cpp b/clang-tools-extra/clangd/ClangdLSPServer.cpp
index ebd42abd2dd61..65da685be1278 100644
--- a/clang-tools-extra/clangd/ClangdLSPServer.cpp
+++ b/clang-tools-extra/clangd/ClangdLSPServer.cpp
@@ -518,6 +518,7 @@ void ClangdLSPServer::onInitialize(const InitializeParams &Params,
 
   Opts.CodeComplete.EnableSnippets = Params.capabilities.CompletionSnippets;
   Opts.CodeComplete.IncludeFixIts = Params.capabilities.CompletionFixes;
+  Opts.CodeComplete.EnableInsertReplace = Params.capabilities.InsertReplace;
   if (!Opts.CodeComplete.BundleOverloads)
     Opts.CodeComplete.BundleOverloads = Params.capabilities.HasSignatureHelp;
   Opts.CodeComplete.DocumentationFormat =
diff --git a/clang-tools-extra/clangd/CodeComplete.cpp b/clang-tools-extra/clangd/CodeComplete.cpp
index 7c390f9c8219d..c58ef139b7437 100644
--- a/clang-tools-extra/clangd/CodeComplete.cpp
+++ b/clang-tools-extra/clangd/CodeComplete.cpp
@@ -1382,14 +1382,17 @@ void loadMainFilePreambleMacros(const Preprocessor &PP,
 bool semaCodeComplete(std::unique_ptr<CodeCompleteConsumer> Consumer,
                       const clang::CodeCompleteOptions &Options,
                       const SemaCompleteInput &Input,
-                      IncludeStructure *Includes = nullptr) {
+                      IncludeStructure *Includes = nullptr,
+                      std::unique_ptr<CompilerInvocation> CI = nullptr) {
   trace::Span Tracer("Sema completion");
 
   IgnoreDiagnostics IgnoreDiags;
-  auto CI = buildCompilerInvocation(Input.ParseInput, IgnoreDiags);
   if (!CI) {
-    elog("Couldn't create CompilerInvocation");
-    return false;
+    CI = buildCompilerInvocation(Input.ParseInput, IgnoreDiags);
+    if (!CI) {
+      elog("Couldn't create CompilerInvocation");
+      return false;
+    }
   }
   auto &FrontendOpts = CI->getFrontendOpts();
   FrontendOpts.SkipFunctionBodies = true;
@@ -1615,7 +1618,8 @@ class CodeCompleteFlow {
   bool Incomplete = false; // Would more be available with a higher limit?
   CompletionPrefix HeuristicPrefix;
   std::optional<FuzzyMatcher> Filter; // Initialized once Sema runs.
-  Range ReplacedRange;
+  Range InsertRange;
+  std::optional<Range> ReplaceRange;
   std::vector<std::string> QueryScopes;      // Initialized once Sema runs.
   std::vector<std::string> AccessibleScopes; // Initialized once Sema runs.
   // Initialized once QueryScopes is initialized, if there are scopes.
@@ -1750,8 +1754,19 @@ class CodeCompleteFlow {
     IsUsingDeclaration = false;
     Filter = FuzzyMatcher(HeuristicPrefix.Name);
     auto Pos = offsetToPosition(Content, Offset);
-    ReplacedRange.start = ReplacedRange.end = Pos;
-    ReplacedRange.start.character -= HeuristicPrefix.Name.size();
+    InsertRange.start = InsertRange.end = Pos;
+    InsertRange.start.character -= HeuristicPrefix.Name.size();
+
+    if (Opts.EnableInsertReplace) {
+      ReplaceRange.emplace();
+      ReplaceRange->start = InsertRange.start;
+      // Scan forward past ASCII identifier characters to find replace end.
+      size_t ReplaceEnd = Offset;
+      while (ReplaceEnd < Content.size() &&
+             isAsciiIdentifierContinue(Content[ReplaceEnd]))
+        ++ReplaceEnd;
+      ReplaceRange->end = offsetToPosition(Content, ReplaceEnd);
+    }
 
     llvm::StringMap<SourceParams> ProxSources;
     ProxSources[FileName].Cost = 0;
@@ -1832,19 +1847,26 @@ class CodeCompleteFlow {
   CodeCompleteResult runWithSema() {
     const auto &CodeCompletionRange = CharSourceRange::getCharRange(
         Recorder->CCSema->getPreprocessor().getCodeCompletionTokenRange());
+
+    const SourceManager &SM = Recorder->CCSema->getSourceManager();
+
     // When we are getting completions with an empty identifier, for example
     //    std::vector<int> asdf;
     //    asdf.^;
     // Then the range will be invalid and we will be doing insertion, use
     // current cursor position in such cases as range.
     if (CodeCompletionRange.isValid()) {
-      ReplacedRange = halfOpenToRange(Recorder->CCSema->getSourceManager(),
-                                      CodeCompletionRange);
+      InsertRange = halfOpenToRange(SM, CodeCompletionRange);
     } else {
       const auto &Pos = sourceLocToPosition(
-          Recorder->CCSema->getSourceManager(),
-          Recorder->CCSema->getPreprocessor().getCodeCompletionLoc());
-      ReplacedRange.start = ReplacedRange.end = Pos;
+          SM, Recorder->CCSema->getPreprocessor().getCodeCompletionLoc());
+      InsertRange.start = InsertRange.end = Pos;
+    }
+
+    if (Opts.EnableInsertReplace) {
+      ReplaceRange.emplace();
+      ReplaceRange->start = InsertRange.start;
+      ReplaceRange->end = getEndOfCodeCompletionReplace(SM);
     }
     Filter = FuzzyMatcher(
         Recorder->CCSema->getPreprocessor().getCodeCompletionFilter());
@@ -1874,6 +1896,26 @@ class CodeCompleteFlow {
     return toCodeCompleteResult(Top);
   }
 
+  // Returns the LSP position at the end of the identifier suffix after the
+  // code completion cursor.
+  Position getEndOfCodeCompletionReplace(const SourceManager &SM) {
+    const Preprocessor &PP = Recorder->CCSema->getPreprocessor();
+    const LangOptions &LangOpts = Recorder->CCSema->getLangOpts();
+
+    // Skip past the code completion NUL byte and scan forward through
+    // identifier continuation characters (letters, digits, _, $, UCN,
+    // unicode). This handles all cases uniformly: with prefix ("vac^1abc"),
+    // without prefix ("vec.^asdf"), and digit-starting ("vec.^1abc").
+    const SourceLocation SuffixBegin =
+        PP.getCodeCompletionLoc().getLocWithOffset(1);
+    Position End = sourceLocToPosition(
+        SM, Lexer::findEndOfIdentifierContinuation(SuffixBegin, SM, LangOpts));
+    // Adjust for the NUL byte inserted at the cursor by code completion,
+    // which inflates the column by 1.
+    End.character--;
+    return End;
+  }
+
   CodeCompleteResult
   toCodeCompleteResult(const std::vector<ScoredBundle> &Scored) {
     CodeCompleteResult Output;
@@ -1885,7 +1927,8 @@ class CodeCompleteFlow {
     for (auto &C : Scored) {
       Output.Completions.push_back(toCodeCompletion(C.first));
       Output.Completions.back().Score = C.second;
-      Output.Completions.back().CompletionTokenRange = ReplacedRange;
+      Output.Completions.back().CompletionInsertRange = InsertRange;
+      Output.Completions.back().CompletionReplaceRange = ReplaceRange;
       if (Opts.Index && !Output.Completions.back().Documentation) {
         for (auto &Cand : C.first) {
           if (Cand.SemaResult &&
@@ -1909,7 +1952,8 @@ class CodeCompleteFlow {
     }
     Output.HasMore = Incomplete;
     Output.Context = CCContextKind;
-    Output.CompletionRange = ReplacedRange;
+    Output.InsertRange = InsertRange;
+    Output.ReplaceRange = ReplaceRange;
 
     // Look up documentation from the index.
     if (Opts.Index) {
@@ -2230,16 +2274,54 @@ CompletionPrefix guessCompletionPrefix(llvm::StringRef Content,
   return Result;
 }
 
+// If Offset is inside what looks like argument comment (e.g.
+// "/*^*/" or "/* foo = ^*/"), returns the offset pointing past the closing "*/".
+static std::optional<unsigned>
+maybeFunctionArgumentCommentEnd(const PathRef FileName, const unsigned Offset,
+                                const llvm::StringRef Content,
+                                const LangOptions &LangOpts) {
+  if (Offset > Content.size())
+    return std::nullopt;
+
+  SourceManagerForFile FileSM(FileName, Content);
+  const SourceManager &SM = FileSM.get();
+  const SourceLocation Cursor =
+      SM.getComposedLoc(SM.getMainFileID(), static_cast<unsigned>(Offset));
+  const SourceLocation EndOfSuffix =
+      Lexer::findEndOfIdentifierContinuation(Cursor, SM, LangOpts);
+  const unsigned EndOfSuffixOffset = SM.getFileOffset(EndOfSuffix);
+
+  const llvm::StringRef Rest = Content.drop_front(EndOfSuffixOffset);
+  llvm::StringRef RestTrimmed = Rest.ltrim();
+  // Comment argument pattern: `/* name = */` — skip past optional `=`.
+  if (RestTrimmed.starts_with("="))
+    RestTrimmed = RestTrimmed.drop_front(1).ltrim();
+  if (RestTrimmed.starts_with("*/"))
+    return EndOfSuffixOffset + (Rest.size() - RestTrimmed.size()) + 2;
+  return std::nullopt;
+}
+
 // Code complete the argument name on "/*" inside function call.
-// Offset should be pointing to the start of the comment, i.e.:
+// OutsideStartOffset should be pointing before the comment, i.e.:
 // foo(^/*, rather than foo(/*^) where the cursor probably is.
-CodeCompleteResult codeCompleteComment(PathRef FileName, unsigned Offset,
-                                       llvm::StringRef Prefix,
-                                       const PreambleData *Preamble,
-                                       const ParseInputs &ParseInput) {
+CodeCompleteResult
+codeCompleteComment(PathRef FileName, const unsigned CursorOffset,
+                    unsigned OutsideStartOffset, llvm::StringRef Prefix,
+                    const PreambleData *Preamble, const ParseInputs &ParseInput,
+                    const CodeCompleteOptions &Opts) {
   if (Preamble == nullptr) // Can't run without Sema.
     return CodeCompleteResult();
 
+  IgnoreDiagnostics IgnoreDiags;
+  auto CI = buildCompilerInvocation(ParseInput, IgnoreDiags);
+  if (!CI)
+    return CodeCompleteResult();
+
+  std::optional<unsigned> OutsideEndOffset;
+  if (Opts.EnableInsertReplace)
+    OutsideEndOffset = maybeFunctionArgumentCommentEnd(
+        FileName, CursorOffset, ParseInput.Contents, CI->getLangOpts());
+
   clang::CodeCompleteOptions Options;
   Options.IncludeGlobals = false;
   Options.IncludeMacros = false;
@@ -2250,20 +2332,31 @@ CodeCompleteResult codeCompleteComment(PathRef FileName, unsigned Offset,
   // full patch.
   semaCodeComplete(
       std::make_unique<ParamNameCollector>(Options, ParamNames), Options,
-      {FileName, Offset, *Preamble,
+      {FileName, OutsideStartOffset, *Preamble,
        PreamblePatch::createFullPatch(FileName, ParseInput, *Preamble),
-       ParseInput});
+       ParseInput},
+      /*Includes=*/nullptr, std::move(CI));
   if (ParamNames.empty())
     return CodeCompleteResult();
 
   CodeCompleteResult Result;
-  Range CompletionRange;
+  Range InsertRange;
   // Skip /*
-  Offset += 2;
-  CompletionRange.start = offsetToPosition(ParseInput.Contents, Offset);
-  CompletionRange.end =
-      offsetToPosition(ParseInput.Contents, Offset + Prefix.size());
-  Result.CompletionRange = CompletionRange;
+  const unsigned InsideStartOffset = OutsideStartOffset + 2;
+  InsertRange.start = offsetToPosition(ParseInput.Contents, InsideStartOffset);
+  InsertRange.end =
+      offsetToPosition(ParseInput.Contents, InsideStartOffset + Prefix.size());
+  Result.InsertRange = InsertRange;
+
+  if (Opts.EnableInsertReplace) {
+    Range ReplaceRange;
+    ReplaceRange.start = InsertRange.start;
+    ReplaceRange.end = OutsideEndOffset ? offsetToPosition(ParseInput.Contents,
+                                                           *OutsideEndOffset)
+                                        : InsertRange.end;
+    Result.ReplaceRange = ReplaceRange;
+  }
+
   Result.Context = CodeCompletionContext::CCC_NaturalLanguage;
   for (llvm::StringRef Name : ParamNames) {
     if (!Name.starts_with(Prefix))
@@ -2272,7 +2365,8 @@ CodeCompleteResult codeCompleteComment(PathRef FileName, unsigned Offset,
     Item.Name = Name.str() + "=*/";
     Item.FilterText = Item.Name;
     Item.Kind = CompletionItemKind::Text;
-    Item.CompletionTokenRange = CompletionRange;
+    Item.CompletionInsertRange = InsertRange;
+    Item.CompletionReplaceRange = Result.ReplaceRange;
     Item.Origin = SymbolOrigin::AST;
     Result.Completions.push_back(Item);
   }
@@ -2312,8 +2406,8 @@ CodeCompleteResult codeComplete(PathRef FileName, Position Pos,
     // parsing, so we must move back the position before running it, extract
     // information we need and construct completion items ourselves.
     auto CommentPrefix = Content.substr(*OffsetBeforeComment + 2).trim();
-    return codeCompleteComment(FileName, *OffsetBeforeComment, CommentPrefix,
-                               Preamble, ParseInput);
+    return codeCompleteComment(FileName, *Offset, *OffsetBeforeComment,
+                               CommentPrefix, Preamble, ParseInput, Opts);
   }
 
   auto Flow = CodeCompleteFlow(
@@ -2423,7 +2517,9 @@ CompletionItem CodeCompletion::render(const CodeCompleteOptions &Opts) const {
   }
   LSP.sortText = sortText(Score.Total, FilterText);
   LSP.filterText = FilterText;
-  LSP.textEdit = {CompletionTokenRange, RequiredQualifier + Name, ""};
+  TextEdit Edit;
+  Edit.range = CompletionInsertRange;
+  Edit.newText = RequiredQualifier + Name;
   // Merge continuous additionalTextEdits into main edit. The main motivation
   // behind this is to help LSP clients, it seems most of them are confused when
   // they are provided with additionalTextEdits that are consecutive to main
@@ -2432,19 +2528,34 @@ CompletionItem CodeCompletion::render(const CodeCompleteOptions &Opts) const {
   // is mainly to help LSP clients again, so that changes do not effect each
   // other.
   for (const auto &FixIt : FixIts) {
-    if (FixIt.range.end == LSP.textEdit->range.start) {
-      LSP.textEdit->newText = FixIt.newText + LSP.textEdit->newText;
-      LSP.textEdit->range.start = FixIt.range.start;
+    if (FixIt.range.end == Edit.range.start) {
+      Edit.newText = FixIt.newText + Edit.newText;
+      Edit.range.start = FixIt.range.start;
     } else {
       LSP.additionalTextEdits.push_back(FixIt);
     }
   }
   if (Opts.EnableSnippets)
-    LSP.textEdit->newText += SnippetSuffix;
+    Edit.newText += SnippetSuffix;
 
   // FIXME(kadircet): Do not even fill insertText after making sure textEdit is
   // compatible with most of the editors.
-  LSP.insertText = LSP.textEdit->newText;
+  LSP.insertText = Edit.newText;
+  if (Opts.EnableInsertReplace) {
+    assert(CompletionReplaceRange &&
+           "CompletionReplaceRange must be already set before render() "
+           "when EnableInsertReplace is on");
+    InsertReplaceEdit IRE;
+    IRE.newText = std::move(Edit.newText);
+    IRE.insert = Edit.range;
+    IRE.replace = *CompletionReplaceRange;
+    // FixIt merging may have extended the insert range start; keep replace
+    // range as a superset per LSP spec.
+    IRE.replace.start = IRE.insert.start;
+    LSP.textEdit = std::move(IRE);
+  } else {
+    LSP.textEdit = std::move(Edit);
+  }
   // Some clients support snippets but work better with plaintext.
   // So if the snippet is trivial, let the client know.
   // https://github.com/clangd/clangd/issues/922
diff --git a/clang-tools-extra/clangd/CodeComplete.h b/clang-tools-extra/clangd/CodeComplete.h
index cde22a8212e6a..99ae48c30907c 100644
--- a/clang-tools-extra/clangd/CodeComplete.h
+++ b/clang-tools-extra/clangd/CodeComplete.h
@@ -71,6 +71,10 @@ struct CodeCompleteOptions {
   /// Whether to present doc comments as plain-text or markdown.
   MarkupKind DocumentationFormat = MarkupKind::PlainText;
 
+  /// Whether to present the completion as a single textEdit range or as two
+  /// ranges (insert/replace).
+  bool EnableInsertReplace = false;
+
   Config::HeaderInsertionPolicy InsertIncludes =
       Config::HeaderInsertionPolicy::IWYU;
 
@@ -222,7 +226,9 @@ struct CodeCompletion {
   std::vector<TextEdit> FixIts;
 
   /// Holds the range of the token we are going to replace with this completion.
-  Range CompletionTokenRange;
+  Range CompletionInsertRange;
+  /// If set, the range to use when the client's insert mode is "replace".
+  std::optional<Range> CompletionReplaceRange;
 
   // Scores are used to rank completion items.
   struct Scores {
@@ -261,8 +267,12 @@ struct CodeCompleteResult {
   // The text that is being directly completed.
   // Example: foo.pb^ -> foo.push_back()
   //              ~~
-  // Typically matches the textEdit.range of Completions, but not guaranteed to.
-  std::optional<Range> CompletionRange;
+  // Typically matches the textEdit.range (or textEdit.insert range) of
+  // Completions, but not guaranteed to.
+  std::optional<Range> InsertRange;
+  // If not empty, typically matches the textEdit.replace range of Completions,
+  // but not guaranteed to.
+  std::optional<Range> ReplaceRange;
   // Usually the source will be parsed with a real C++ parser.
   // But heuristics may be used instead if e.g. the preamble is not ready.
   bool RanParser = true;
diff --git a/clang-tools-extra/clangd/Protocol.cpp b/clang-tools-extra/clangd/Protocol.cpp
index 793db7b052990..f77b0773d445a 100644
--- a/clang-tools-extra/clangd/Protocol.cpp
+++ b/clang-tools-extra/clangd/Protocol.cpp
@@ -202,6 +202,14 @@ llvm::json::Value toJSON(const TextEdit &P) {
   return Result;
 }
 
+llvm::json::Value toJSON(const InsertReplaceEdit &P) {
+  return llvm::json::Object{
+      {"newText", P.newText},
+      {"insert", P.insert},
+      {"replace", P.replace},
+  };
+}
+
 bool fromJSON(const llvm::json::Value &Params, ChangeAnnotation &R,
               llvm::json::Path P) {
   llvm::json::ObjectMapper O(Params, P);
@@ -414,6 +422,8 @@ bool fromJSON(const llvm::json::Value &Params, ClientCapabilities &R,
               break;
           }
         }
+        if (auto IRSupport = Item->getBoolean("insertReplaceSupport"))
+          R.InsertReplace = *IRSupport;
       }
       if (auto *ItemKind = Completion->getObject("completionItemKind")) {
         if (auto *ValueSet = ItemKind->get("valueSet")) {
@@ -1184,7 +1194,8 @@ llvm::json::Value toJSON(const CompletionItem &CI) {
   if (CI.insertTextFormat != InsertTextFormat::Missing)
     Result["insertTextFormat"] = static_cast<int>(CI.insertTextFormat);
   if (CI.textEdit)
-    Result["textEdit"] = *CI.textEdit;
+    Result["textEdit"] = std::visit(
+        [](const auto &V) { return llvm::json::Value(V); }, *CI.textEdit);
   if (!CI.additionalTextEdits.empty())
     Result["additionalTextEdits"] = llvm::json::Array(CI.additionalTextEdits);
   if (CI.deprecated)
diff --git a/clang-tools-extra/clangd/Protocol.h b/clang-tools-extra/clangd/Protocol.h
index 7a99721a1e856..9c1bb9d9bb059 100644
--- a/clang-tools-extra/clangd/Protocol.h
+++ b/clang-tools-extra/clangd/Protocol.h
@@ -34,6 +34,7 @@
 #include <memory>
 #include <optional>
 #include <string>
+#include <variant>
 #include <vector>
 
 // This file is using the LSP syntax for identifier names which is different
@@ -261,6 +262,18 @@ bool fromJSON(const llvm::json::Value &, TextEdit &, llvm::json::Path);
 llvm::json::Value toJSON(const TextEdit &);
 llvm::raw_ostream &operator<<(llvm::raw_ostream &, const TextEdit &);
 
+struct InsertReplaceEdit {
+  /// The string to be inserted.
+  std::string newText;
+
+  /// The range if the insert is requested.
+  Range insert;
+
+  /// The range if the replace is requested.
+  Range replace;
+};
+llvm::json::Value toJSON(const InsertReplaceEdit &);
+
 struct ChangeAnnotation {
   /// A human-readable string describing the actual change. The string
   /// is rendered prominent in the user interface.
@@ -510,6 +523,10 @@ struct ClientCapabilities {
   /// textDocument.completion.completionItem.documentationFormat
   MarkupKind CompletionDocumentationFormat = MarkupKind::PlainText;
 
+  /// Client supports insert replace edit to control different behavior if a
+  /// completion item is inserted in the text or should replace text.
+  bool InsertReplace = false;
+
   /// The client has support for completion item label details.
   /// textDocument.completion.completionItem.labelDetailsSupport.
   bool CompletionLabelDetail = false;
@@ -1372,9 +1389,13 @@ struct CompletionItem {
   /// An edit which is applied to a document when selecting this completion.
   /// When an edit is provided `insertText` is ignored.
   ///
-  /// Note: The range of the edit must be a single line range and it must
-  /// contain the position at which completion has been requested.
-  std::optional<TextEdit> textEdit;
+  /// Note 1: The text edit's range as well as both ranges from an insert
+  /// replace edit must be a single line range and must contain the position
+  /// at which completion has been requested.
+  /// Note 2: If an `InsertReplaceEdit` is returned, the edit's insert range
+  /// must be a prefix of the edit's replace range, meaning it must be
+  /// contained in and starting at the same position.
+  std:...
[truncated]

@argothiel

argothiel commented Mar 20, 2026

Copy link
Copy Markdown
Contributor Author

Ah, worth mentioning: if you want to test it locally, vscode-clangd extension has a bug which crashes the autocompletion in the replace mode unless you set "clangd.serverCompletionRanking": false.

@argothiel

argothiel commented Mar 20, 2026

Copy link
Copy Markdown
Contributor Author

@argothiel

Copy link
Copy Markdown
Contributor Author

@HighCommander4 @kadircet @MythreyaK
Could you take a look when you get a chance?

@github-actions

github-actions Bot commented Mar 20, 2026

Copy link
Copy Markdown

✅ With the latest revision this PR passed the C/C++ code formatter.

@HighCommander4 HighCommander4 self-requested a review March 21, 2026 00:50
@HighCommander4

Copy link
Copy Markdown
Contributor

Hi @argothiel, thanks for the patch, it's appreciated!

I've added myself as a reviewer. Just as a heads up though, I've got a fairly sizeable backlog of PRs to review, so apologies in advance if it takes me a while to get to this one.

(I've also been working on trying to onboard additional reviewers to the project, and so it's possible someone else may pick up this review before I get to it.)

@argothiel

argothiel commented Mar 21, 2026

Copy link
Copy Markdown
Contributor Author

@HighCommander4 Thank you, Nathan, I'll wait patiently, I can imagine the work of the main maintainer of clangd can't be easy. If I can make it any easier for you, like answer any questions or walk you through the code logic, just let me know.

Comment thread clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
Comment thread clang/unittests/Lex/LexerTest.cpp
Add tests for ()=<> for Sema and no Sema.
Add tests for ()=<>{} for Lexer.
@argothiel argothiel requested a review from JVApen March 21, 2026 22:13
@JVApen JVApen requested review from JVApen and removed request for JVApen March 22, 2026 06:08
@argothiel

Copy link
Copy Markdown
Contributor Author

Can someone with Maintainer access approve the CI tests run?

@argothiel

Copy link
Copy Markdown
Contributor Author

Ping

@timon-ul timon-ul left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I have been looking at NextToken and its uses again. What a surprise to find out it is actually related to the FIXME you found in CodeCompleteTests.cpp. The whole idea of the commit is to surpress the creation of additional parenthesis, which in some cases it does, but in others it does not, the issue is tracked here (so you have the already relevant context if you decide to work on that as a follow up). As you can see though, back in that commit the way to get the NextToken was by calling Lexer::findNextToken which caused other problems why @HighCommander4 actually fixed this behaviour 2 years ago here. All this makes me even more convinced, that the use case of NextToken and your reason to create findEndOfIdentifierContinuation are heavily overlapping and thus I am convinced we should not duplicate code and calls here, but instad decide on one solution. I am fine by simply adapting your new function for this, but maybe @HighCommander4 could voice his thoughts too given he wrote findTokenAfterCompletionPoint.

@argothiel

Copy link
Copy Markdown
Contributor Author

I agree that we don't need two helpers. The new method fixes the issues with the old one and could replace it.

Note that the logic in both methods is slightly different. findTokenAfterCompletionPoint assumes we're at the end of a token and is looking for the next token (ignoring whitespace); findEndOfIdentifierContinuation looks for the end of the identifier, parsing more thoroughly, and then we are already at the beginning of the next token.

This means we cannot reuse or rewrite the old method to express the new one: findEndOfIdentifierContinuation is a strictly more powerful primitive. On the other hand, we can express findTokenAfterCompletionPoint as findEndOfIdentifierContinuation + a raw-lex of one token from the returned location.

So the proposal: keep findEndOfIdentifierContinuation (it needs to be in clang::Lexer because it requires the lexer's internals), delete findTokenAfterCompletionPoint from clangd, and add a small raw-lex-one-token helper that takes the result of findEndOfIdentifierContinuation as input. Doing that unification would fix the FIXME at CodeCompleteTests.cpp by kirillbobyrev.

Happy to do it in this PR if you'd like, or split it into a follow-up to keep this one focused on InsertReplaceEdit and not changing the existing logic. Or maybe I should deliver the bug fix first and then return to this? What do you think?

@argothiel argothiel requested a review from timon-ul May 9, 2026 17:27
@timon-ul

timon-ul commented May 9, 2026

Copy link
Copy Markdown
Contributor

Happy to do it in this PR if you'd like, or split it into a follow-up to keep this one focused on InsertReplaceEdit and not changing the existing logic.

You got me convinced. Let's do it in a follow up PR and not waste effort for a makeshift solution we will just axe again. I was hesistant, because I would prefer to keep the codebase clean, but I agree that if you already touch the other function it makes more sense to combine it with fixing the lacking behaviour and at that point it makes sense for it to be its own PR.

findTokenAfterCompletionPoint assumes we're at the end of a token and is looking for the next token (ignoring whitespace)

Just wanted to note that I don't think ignoring the whitespace is desired behaviour, at least I cannot see any reason as to when it would be.

Anyway I would be ready to merge this once you resolve the pending merge conflicts with main.

Edit: nvm I do it myself I noticed my other PR created those conflicts.

@timon-ul timon-ul left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@timon-ul timon-ul merged commit 3b4499c into llvm:main May 10, 2026
11 checks passed
@github-actions

Copy link
Copy Markdown

@argothiel Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

@llvm-ci

llvm-ci commented May 10, 2026

Copy link
Copy Markdown

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux-fast running on sanitizer-buildbot4 while building clang-tools-extra,clang at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/169/builds/22723

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using ld.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/main.py:74: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 98657 of 99228 tests, 64 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60..
FAIL: Clangd Unit Tests :: ./ClangdTests/30/87 (19903 of 98657)
******************** TEST 'Clangd Unit Tests :: ./ClangdTests/30/87' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/tools/clang/tools/extra/clangd/unittests/./ClangdTests-Clangd Unit Tests-719314-30-87.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=87 GTEST_SHARD_INDEX=30 /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/tools/clang/tools/extra/clangd/unittests/./ClangdTests
--

Note: This is test shard 31 of 87.
[==========] Running 16 tests from 15 test suites.
[----------] Global test environment set-up.
[----------] 1 test from BackgroundIndex
[ RUN      ] BackgroundIndex.Profile
[       OK ] BackgroundIndex.Profile (1 ms)
[----------] 1 test from BackgroundIndex (1 ms total)

[----------] 2 tests from CompletionTest
[ RUN      ] CompletionTest.Documentation
Built preamble of size 209292 for file /clangd-test/foo.cpp version null in 0.02 seconds
Code complete: sema context Expression, query scopes [] (AnyScope=false), expected type int
Code complete: 88 results from Sema, 0 from Index, 0 matched, 0 from identifiers, 88 returned.
[       OK ] CompletionTest.Documentation (49 ms)
[ RUN      ] CompletionTest.ReplaceRange
Built preamble of size 209288 for file /clangd-test/foo.cpp version null in 0.01 seconds
Code complete: sema context DotMemberAccess, query scopes [] (AnyScope=false), expected type <none>
Code complete: 1 results from Sema, 0 from Index, 0 matched, 0 from identifiers, 1 returned.
Built preamble of size 209288 for file /clangd-test/foo.cpp version null in 0.01 seconds
Code complete: sema context DotMemberAccess, query scopes [] (AnyScope=false), expected type <none>
Code complete: 1 results from Sema, 0 from Index, 0 matched, 0 from identifiers, 1 returned.
Built preamble of size 209288 for file /clangd-test/foo.cpp version null in 0.01 seconds
Code complete: sema context DotMemberAccess, query scopes [] (AnyScope=false), expected type <none>
Code complete: 1 results from Sema, 0 from Index, 0 matched, 0 from identifiers, 1 returned.
Built preamble of size 209288 for file /clangd-test/foo.cpp version null in 0.01 seconds
 #0 0x00005b2ca2cd2996 ___interceptor_backtrace /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:4556:13
 #1 0x00005b2ca46402e4 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:884:8
 #2 0x00005b2ca463afd7 llvm::sys::RunSignalHandlers() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/Signals.cpp:0:5
 #3 0x00005b2ca4642ba7 SignalHandler(int, siginfo_t*, void*) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:448:38
 #4 0x000071bf5fe47e10 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x47e10)
 #5 0x000071bf5feae3ec pthread_kill (/usr/lib/x86_64-linux-gnu/libc.so.6+0xae3ec)
 #6 0x000071bf5fe47cde raise (/usr/lib/x86_64-linux-gnu/libc.so.6+0x47cde)
Step 10 (stage2/asan_ubsan check) failure: stage2/asan_ubsan check (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using ld.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/main.py:74: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 98657 of 99228 tests, 64 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60..
FAIL: Clangd Unit Tests :: ./ClangdTests/30/87 (19903 of 98657)
******************** TEST 'Clangd Unit Tests :: ./ClangdTests/30/87' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/tools/clang/tools/extra/clangd/unittests/./ClangdTests-Clangd Unit Tests-719314-30-87.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=87 GTEST_SHARD_INDEX=30 /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/tools/clang/tools/extra/clangd/unittests/./ClangdTests
--

Note: This is test shard 31 of 87.
[==========] Running 16 tests from 15 test suites.
[----------] Global test environment set-up.
[----------] 1 test from BackgroundIndex
[ RUN      ] BackgroundIndex.Profile
[       OK ] BackgroundIndex.Profile (1 ms)
[----------] 1 test from BackgroundIndex (1 ms total)

[----------] 2 tests from CompletionTest
[ RUN      ] CompletionTest.Documentation
Built preamble of size 209292 for file /clangd-test/foo.cpp version null in 0.02 seconds
Code complete: sema context Expression, query scopes [] (AnyScope=false), expected type int
Code complete: 88 results from Sema, 0 from Index, 0 matched, 0 from identifiers, 88 returned.
[       OK ] CompletionTest.Documentation (49 ms)
[ RUN      ] CompletionTest.ReplaceRange
Built preamble of size 209288 for file /clangd-test/foo.cpp version null in 0.01 seconds
Code complete: sema context DotMemberAccess, query scopes [] (AnyScope=false), expected type <none>
Code complete: 1 results from Sema, 0 from Index, 0 matched, 0 from identifiers, 1 returned.
Built preamble of size 209288 for file /clangd-test/foo.cpp version null in 0.01 seconds
Code complete: sema context DotMemberAccess, query scopes [] (AnyScope=false), expected type <none>
Code complete: 1 results from Sema, 0 from Index, 0 matched, 0 from identifiers, 1 returned.
Built preamble of size 209288 for file /clangd-test/foo.cpp version null in 0.01 seconds
Code complete: sema context DotMemberAccess, query scopes [] (AnyScope=false), expected type <none>
Code complete: 1 results from Sema, 0 from Index, 0 matched, 0 from identifiers, 1 returned.
Built preamble of size 209288 for file /clangd-test/foo.cpp version null in 0.01 seconds
 #0 0x00005b2ca2cd2996 ___interceptor_backtrace /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/compiler-rt/lib/asan/../sanitizer_common/sanitizer_common_interceptors.inc:4556:13
 #1 0x00005b2ca46402e4 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:884:8
 #2 0x00005b2ca463afd7 llvm::sys::RunSignalHandlers() /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/Signals.cpp:0:5
 #3 0x00005b2ca4642ba7 SignalHandler(int, siginfo_t*, void*) /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:448:38
 #4 0x000071bf5fe47e10 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x47e10)
 #5 0x000071bf5feae3ec pthread_kill (/usr/lib/x86_64-linux-gnu/libc.so.6+0xae3ec)
 #6 0x000071bf5fe47cde raise (/usr/lib/x86_64-linux-gnu/libc.so.6+0x47cde)

@argothiel

Copy link
Copy Markdown
Contributor Author

Thank you, @timon-ul. I'll work on the other issue. The whitespace is a bit tricky to decide, I can imagine that in myfun^ <T>(x, y), it makes sense to look past the whitespace to decide about skipping the parens. But I think it's a discussion for another PR.

@argothiel argothiel deleted the argothiel/insertReplace branch May 10, 2026 21:14
@HighCommander4

Copy link
Copy Markdown
Contributor

@argothiel before you move on to follow-up work, the ASAN error indicated in this comment needs some further investigation.

It's showing memory corruption occurring during the execution of two newly added tests, CompletionTest.ReplaceRange and CompletionTest.ReplaceRangeNoCompiler. You can see the full stacks here.

@argothiel

Copy link
Copy Markdown
Contributor Author

@HighCommander4 Looking into it.

@timon-ul

Copy link
Copy Markdown
Contributor

@argothiel I had a first look and it seems to be related to your smileys (the UTF8 character) creating some "global-buffer-overflow" when the names are handled later on in FuzzyMatch.cpp:149

@argothiel

argothiel commented May 11, 2026

Copy link
Copy Markdown
Contributor Author

@timon-ul Yes, that's correct. The calculateRoles function has a bug which this PR revealed.

@timon-ul

timon-ul commented May 11, 2026

Copy link
Copy Markdown
Contributor

@argothiel idk how deep you are in your investigation, but I think I found the bug. The function here has the parameter int which on a UTF8 character produces a negative index lookup. The bits are correct but we need the system to treat them as such. I am honestly not very used to dealing with bits and switching between signed and unsigned and do not think it is enough to just change the type in the function to unsigned int, so there is more testing required. Here is what I think is going wrong: https://godbolt.org/z/xq8fT4Yba simply remove the unsigned to get the current behaviour, while with signed this is what is expected as the index lookup (the bits I took here are the first 8 from the smiley).

Edit: the reason I am not 100% confident this is exactly the problem is because the asan is reporting a lookup 15 bytes before the CharTypes lookup table, but my example above produces a negative index of -4, which is not that many bytes before it and leaves me still a bit puzzled.

Edit2: Actually if I directly print out (int) a in the example above I get -16 which would perfectly match to the ASAN error, now I feel confident again.

@argothiel

argothiel commented May 11, 2026

Copy link
Copy Markdown
Contributor Author

@timon-ul I've had a fix for a couple of hours now. It is enough to change it to unsigned; I've decided to go with unsigned char to reflect the usage (and protect from huge ints), and I've added a unit test.

However I'm waiting for all the sanitizers to pass locally before opening a PR, which takes quite a while.

@argothiel

argothiel commented May 11, 2026

Copy link
Copy Markdown
Contributor Author

@timon-ul:
dd86fb7

@timon-ul

Copy link
Copy Markdown
Contributor

@argothiel alright, looks clean, really like making it an unsigned char so now every lookup is forced to be in the array.

argothiel added a commit to argothiel/llvm-project that referenced this pull request May 11, 2026
The `packedLookup` function doesn't work correctly with byte values over
0x7F (e.g. UTF-8 high bytes) on platforms where `char` is signed (like
x86-64). The character is treated as a negative `char`, which gets
converted to a negative `int`, which makes `I >> 2` negative, which
gives a negative index, and thus an out-of-bounds read.

The fix is to change the `int` parameter type to `unsigned char`, to
always get the value in the 0x00..0xFF range.

The issue has been discovered by the sanitizer buildbot after the
llvm#187623 merge, which first introduced tests with non-ASCII source
content into the code-completion path.
timon-ul pushed a commit that referenced this pull request May 11, 2026
The `packedLookup` function doesn't work correctly with byte values over
0x7F (e.g. UTF-8 high bytes) on platforms where `char` is signed (like
x86-64). The character is treated as a negative `char`, which gets
converted to a negative `int`, which makes `I >> 2` negative, which
gives a negative index, and thus an out-of-bounds read.

The fix is to change the `int` parameter type to `unsigned char`, to
always get the value in the 0x00..0xFF range.

The issue has been discovered by the sanitizer buildbot after the
#187623 merge, which first introduced tests with non-ASCII source
content into the code-completion path.
EuphoricThinking pushed a commit to EuphoricThinking/llvm-project that referenced this pull request May 14, 2026
Handle new insertReplaceSupport capability (defined in LSP 3.16). Add
the new option to the protocol layer and pass it around to the code
completion logic. Update CompletionItem::textEdit to become the union
type as per the LSP specification.

Add a new helper function to the Lexer public API to find the end of an
identifier with full context lexing, to avoid duplicating the logic. Use
the helper both in the Sema flow and in the comment completion flow. Use
a simpler ASCII-only scan in no-Sema mode.

Add LIT tests to verify auto-triggered completions, mid-word
replacement, Unicode, and snippets. Add unit tests to verify
insert/replace ranges with and without Sema, including comments and the
feature-off case.

Update the release notes to document the new capability.

Fixes clangd/clangd#2190

---------

Co-authored-by: timon-ul <timon.ulrich@advantest.com>
EuphoricThinking pushed a commit to EuphoricThinking/llvm-project that referenced this pull request May 14, 2026
The `packedLookup` function doesn't work correctly with byte values over
0x7F (e.g. UTF-8 high bytes) on platforms where `char` is signed (like
x86-64). The character is treated as a negative `char`, which gets
converted to a negative `int`, which makes `I >> 2` negative, which
gives a negative index, and thus an out-of-bounds read.

The fix is to change the `int` parameter type to `unsigned char`, to
always get the value in the 0x00..0xFF range.

The issue has been discovered by the sanitizer buildbot after the
llvm#187623 merge, which first introduced tests with non-ASCII source
content into the code-completion path.
pedroMVicente pushed a commit to pedroMVicente/llvm-project that referenced this pull request May 19, 2026
Handle new insertReplaceSupport capability (defined in LSP 3.16). Add
the new option to the protocol layer and pass it around to the code
completion logic. Update CompletionItem::textEdit to become the union
type as per the LSP specification.

Add a new helper function to the Lexer public API to find the end of an
identifier with full context lexing, to avoid duplicating the logic. Use
the helper both in the Sema flow and in the comment completion flow. Use
a simpler ASCII-only scan in no-Sema mode.

Add LIT tests to verify auto-triggered completions, mid-word
replacement, Unicode, and snippets. Add unit tests to verify
insert/replace ranges with and without Sema, including comments and the
feature-off case.

Update the release notes to document the new capability.

Fixes clangd/clangd#2190

---------

Co-authored-by: timon-ul <timon.ulrich@advantest.com>
pedroMVicente pushed a commit to pedroMVicente/llvm-project that referenced this pull request May 19, 2026
The `packedLookup` function doesn't work correctly with byte values over
0x7F (e.g. UTF-8 high bytes) on platforms where `char` is signed (like
x86-64). The character is treated as a negative `char`, which gets
converted to a negative `int`, which makes `I >> 2` negative, which
gives a negative index, and thus an out-of-bounds read.

The fix is to change the `int` parameter type to `unsigned char`, to
always get the value in the 0x00..0xFF range.

The issue has been discovered by the sanitizer buildbot after the
llvm#187623 merge, which first introduced tests with non-ASCII source
content into the code-completion path.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:frontend Language frontend issues, e.g. anything involving "Sema" clang-tools-extra clangd

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support InsertReplaceEdit

7 participants