-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Scaffolding for CoreCLR interpreter intrinsics #116769
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cc @davidwrighton @jkotas still not really sure what to do here, this is the best idea I could come up with. |
The (non-optimizing) interpreter JIT only needs to recognize "must expand" intrinsic. It should be a small fraction of the intrinsic recognized by the optimizing JIT. I do not think it is worth sharing the infrastructure to recognize intrinsics with the optimizing JIT. It would be just a bunch of extra useless code that does unnecessary work. It would be useful to create list of "must expand" intrinsic that have to be implemented by the (non-optimizing) interpreter so that we do not need to chase one at a time. The canonical pattern for must expand intrinsics is trivial self-recursive call like this: Line 463 in 3c5f74a
A large set of "must expand" instrinsic are hardware and SIMD intrinsics. We should ignore those for now and disable hardware and SIMD acceleration in fully interpreted mode for now. |
I think it's inevitable that we will eventually want to share the jit's NamedIntrinsic enum and name-to-enum mapping, the Mono interpreter ended up having to implement lots of intrinsics. IIRC the reasoning for this is a mix of 'it's too slow without it' and 'mixed mode requires consistency with the AOT'd code' but vlad would know for sure. I agree that we're not ever going to implement all of them, and in some cases we can just make IsSupported/IsHardwareAccelerated return false. We could NIH a much simpler intrinsic lookup for the interpreter-only startup scenario, but it feels like we'd have to throw it out not too long from now. |
I am not sure about that.
I do not want us to turn the interpreter JIT into an optimizing JIT like it was in Mono. If we find that we need to build an optimizing interpreter JIT, it has to be built as RyuJIT backend so that we get all the RyuJIT optimizations for free (including all the intrinsic optimizations). The primary goal of this project is to minimize our long-term maintenance costs. Building multiple optimizing code generators would go against that goal. |
I've updated this PR with an implementation for a few intrinsics along with logic to bypass a bunch of ones that aren't must expand. The comments should also make it clear why must expand doesn't currently work (those intrinsics just get invoked via the JIT/R2R code since it's already there). Suggestions on how to fix that so I can restrict it to must expand intrinsics would be super helpful. |
I think the current rev of this is closer to what people wanted. It currently contains a workaround for a bug (somewhere) in our exception handling, and not all of the intrinsics are actually implemented, but I believe it identifies all of the must expand intrinsics from the list I created by auditing the source code manually. I'm not sure what a good test for this feature looks like. It might not be possible to write one. |
Interestingly, this seems to cause the arm64 lanes to crash and coredump: I wonder if that means we've got something wrong about our calling convention for native methods on arm64? EDIT: I suppose it's also possible that calling this intrinsic as if it were a method is just broken, since it has special handling in the JIT: // The vast majority of "special" intrinsics are Vector64/Vector128 methods.
// The only exception is ArmBase.Yield which should be treated differently.
if (intrinsic == NI_ArmBase_Yield)
{
assert(sig->numArgs == 0);
assert(JITtype2varType(sig->retType) == TYP_VOID);
assert(simdSize == 0);
return gtNewScalarHWIntrinsicNode(TYP_VOID, intrinsic);
} |
@kg The error message of the failure you've mentioned looks like this:
So it looks like that old EH issue. Thinking about the added nop, I have realized that the compiler actually removes nops IIRC, soe we may need to use e.g. INTOP_SAFEPOINT instead. Edit: And I actually already understand why the EH behaves like that and I am figuring out the best way to fix that. It happens because it compensates the instruction pointer like it does for JITted/AOTed code (it subtracts a constant from it). That is done because software exceptions are compiled as calls in JIT/AOT and so the address we get for the first frame is after the call. For interpreter though, the address is at the INTOP_THROW opcode, so no compensation is needed. That means that it behaves like a hardware exception with JIT/AOT where we are also on the faulting instruction and thus we don't do any compensation. |
Thanks for digging in to the EH issue and spotting that in the log. Somehow I overlooked the exception traceback when I dug through the console output. |
I've added and wired up I think once more of our open PRs land, mode 1 will probably work. Mode 2 and 3 are further off. |
Something about how we define InterpConfig is intentionally very weird, it seems like they're intentionally gated on DEBUG instead of FEATURE_INTERPRETER. Is that a mistake? As a result I have to gate the mode config variable on DEBUG as well and have intrinsics not work in release (or checked?) configurations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds foundational support for interpreting hardware intrinsics in the CoreCLR interpreter and exposes a new DOTNET_InterpMode
configuration to control interpreter usage. It also introduces a basic test to verify intrinsic behavior.
- Added
TestIntrinsics
in the JIT interpreter tests to validate PNSE behavior for hardware intrinsics. - Implemented
GetNamedIntrinsic
and wiring for intrinsic dispatch (includingINTOP_THROW_PNSE
) in the interpreter. - Extended interpreter configuration (
InterpMode
) with four interpreter modes to control fallback behavior.
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
src/tests/JIT/interpreter/Interpreter.cs | New TestIntrinsics test and import for System.Runtime.Intrinsics . |
src/coreclr/vm/interpexec.cpp | Pass CORINFO_ACCESS_ANY to GetMultiCallableAddrOfCode ; add INTOP_THROW_PNSE handling. |
src/coreclr/interpreter/intrinsics.h | Header for GetNamedIntrinsic . |
src/coreclr/interpreter/intrinsics.cpp | Mapping from method metadata to NamedIntrinsic . |
src/coreclr/interpreter/intops.def | Defined INTOP_THROW_PNSE opcode. |
src/coreclr/interpreter/interpconfigvalues.h | Added InterpMode configuration and comments describing modes. |
src/coreclr/interpreter/eeinterp.cpp | Integrated InterpMode into interpreter decision logic. |
src/coreclr/interpreter/compiler.h | Declared EmitNamedIntrinsicCall . |
src/coreclr/interpreter/compiler.cpp | Implemented EmitNamedIntrinsicCall and hooked it into calls. |
src/coreclr/interpreter/CMakeLists.txt | Included intrinsics.cpp in the interpreter build. |
Comments suppressed due to low confidence (2)
src/coreclr/interpreter/interpconfigvalues.h:30
- The comment for mode 3 ends with a trailing comma and reads like it’s incomplete. Consider finishing the sentence or removing the trailing comma to clearly document all implied environment variable changes.
// 3: use interpreter for everything, the full interpreter-only mode, no fallbacks to R2R or JIT whatsoever. Implies DOTNET_ReadyToRun=0, DOTNET_EnableHWIntrinsic=0,
src/tests/JIT/interpreter/Interpreter.cs:891
- [nitpick] The
HACK
andFIXME
comments here signal temporary workarounds; consider clarifying next steps or removing commented-out code to keep the test implementation clean.
// HACK: Try block that should always throw
At present setting InterpMode to anything other than 0 causes a crash, recording it here for reference:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
Checkpoint Extract named intrinsic lookup out of JIT Compiler class Maybe fix zbb bit Basic wire-up of interpreter intrinsics Bypass more intrinsics Don't assert on unimplemented intrinsic by default, just don't handle it Add more intrinsics to the switch Fix linux build Revert changes to JIT side Checkpoint intrinsics rework Checkpoint Change EH workaround to something with a smaller blast radius Fix linux build Checkpoint: Add DOTNET_InterpMode config var and pipe it through Minor adjustments suggested offline Address PR feedback + maybe fix CI build Rearrange intrinsics test so it won't fail on ARM64 CI
Comment cleanups
Co-authored-by: Jan Kotas <[email protected]>
Co-authored-by: Milos Kotlar <[email protected]>
Windows x86 debug base64 seems to just actually be broken on main, that or we have multiple defective x86 CI machines. Not sure which is the most likely explanation, but 'checked coreclr windows x86 debug' keeps failing even if I rerun it in base64/xml stuff, which I haven't touched in this PR. |
Uh oh!
There was an error while loading. Please reload this page.