java.lang.IllegalArgumentException when running RecoverClassesFromRTTIScript.java script #7516

Open
redfast00 opened this issue Feb 17, 2025 · 15 comments
Labels: Feature: Scripting · Status: Internal (This is being tracked internally by the Ghidra team) · Status: Waiting on customer (Waiting for customer feedback)

@redfast00
Contributor

I get the following stack trace when running RecoverClassesFromRTTIScript.java

java.lang.reflect.InvocationTargetException
	at ghidra.app.plugin.core.analysis.AutoAnalysisManager$AnalysisWorkerCommand.applyTo(AutoAnalysisManager.java:1705)
	at ghidra.app.plugin.core.analysis.AutoAnalysisManager$AnalysisWorkerCommand.applyTo(AutoAnalysisManager.java:1586)
	at ghidra.app.plugin.core.analysis.AutoAnalysisManager$AnalysisTaskWrapper.run(AutoAnalysisManager.java:660)
	at ghidra.app.plugin.core.analysis.AutoAnalysisManager.startAnalysis(AutoAnalysisManager.java:760)
	at ghidra.app.plugin.core.analysis.AutoAnalysisManager.startAnalysis(AutoAnalysisManager.java:639)
	at ghidra.app.plugin.core.analysis.AutoAnalysisManager.startAnalysis(AutoAnalysisManager.java:604)
	at ghidra.app.plugin.core.analysis.AnalysisBackgroundCommand.applyTo(AnalysisBackgroundCommand.java:55)
	at ghidra.app.plugin.core.analysis.AnalysisBackgroundCommand.applyTo(AnalysisBackgroundCommand.java:33)
	at ghidra.framework.plugintool.mgr.BackgroundCommandTask.run(BackgroundCommandTask.java:103)
	at ghidra.framework.plugintool.mgr.ToolTaskManager.run(ToolTaskManager.java:351)
	at java.base/java.lang.Thread.run(Thread.java:1575)
Caused by: java.lang.IllegalArgumentException: Number of array elements may not be negative [-1]
	at ghidra.program.model.data.ArrayDataType.<init>(ArrayDataType.java:87)
	at ghidra.program.model.data.ArrayDataType.<init>(ArrayDataType.java:64)
	at classrecovery.RTTIGccClassRecoverer.createVmiClassTypeInfoStructure(RTTIGccClassRecoverer.java:3517)
	at classrecovery.RTTIGccClassRecoverer.getOrCreateVmiTypeinfoStructure(RTTIGccClassRecoverer.java:2755)
	at classrecovery.RTTIGccClassRecoverer.createTypeinfoStructs(RTTIGccClassRecoverer.java:2364)
	at classrecovery.RTTIGccClassRecoverer.createRecoveredClasses(RTTIGccClassRecoverer.java:182)
	at RecoverClassesFromRTTIScript.run(RecoverClassesFromRTTIScript.java:307)
	at ghidra.app.script.GhidraScript.executeNormal(GhidraScript.java:405)
	at ghidra.app.script.GhidraScript$1.analysisWorkerCallback(GhidraScript.java:387)
	at ghidra.app.plugin.core.analysis.AutoAnalysisManager$AnalysisWorkerCommand.applyTo(AutoAnalysisManager.java:1699)
	... 10 more

---------------------------------------------------
Build Date: 2025-Feb-05 1536 EST
Ghidra Version: 11.3
Java Home: /usr/lib/jvm/java-23-openjdk
JVM Version: Arch Linux 23.0.2
OS: Linux 6.13.2-arch1-1 amd64

To Reproduce

Run the RecoverClassesFromRTTIScript.java script. I'm afraid I can't share the binary, but I can answer questions quickly.

The binary is an ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked. The .comment section of the ELF lists multiple compilers: GCC and clang.

Environment (please complete the following information):

  • OS: Arch Linux (Linux 6.13.2-arch1-1 amd64)
  • Java Version: Arch Linux 23.0.2
  • Ghidra Version: 11.3
  • Ghidra Origin: Github release (AUR package ghidra-git-bin)

cc @astrelsky @ghidra007 because you worked on similar issues before

@ghidra007
Contributor

@redfast00 Are you able to pull from github and do a build? If so, I can add some debug messages that will help identify the issue. Just glancing at your stack trace it appears that it is getting an incorrect value from what it thinks is a vmi typeinfo structure. This makes me think it is incorrectly identifying it in the first place. If you are able to build then we can try and triage this together. I can also add checks for this so that it doesn't blow up but I am curious to know why it is happening in the first place.

@redfast00
Contributor Author

@ghidra007 yes, I can

@ryanmkurtz added the "Status: Internal" label (This is being tracked internally by the Ghidra team) and removed the "Status: Triage" label (Information is being gathered) on Feb 19, 2025
@ryanmkurtz added this to the 11.3.2 milestone on Feb 19, 2025
@ghidra007
Contributor

ghidra007 commented Feb 19, 2025

@redfast00 The change was pushed out. The script shouldn't blow up there anymore but instead will skip that typeinfo structure. Once you make a build and run it can you go into the log and find the debug message with the following format:

: VmiTypeinfoStructure has invalid number of bases: "

If you go to the typeinfo address and make a pointer at that address, it should be a pointer to either the address just below this __cxxabiv1::__vmi_class_type_info::typeinfo if it is contained in a block with real memory under it, or it should point to an external block address with both this symbol and the corresponding vtable symbol. Are you seeing either of these? If so, which?

Also, can you look at the bytes in the field next to the pointer you created? Are the bytes representative of the address it points to, or off by some amount, likely 8 or 16? If they are off, it indicates relocations are being used. I'm wondering if we are not handling the relocations well in your program, so the vmi typeinfo is being found in the wrong location. If you have relocations and they are fixed up correctly, you should see them marked as APPLIED in Window → Relocation Table. If they are not fixed up, I'm not sure whether you will see relocations listed with no info or see none at all.

Can you also make another pointer just below the first one? See if it points to a mangled string which may or may not be created already.

I would also be curious if there are other issues with other typeinfos not being created correctly. Are there any more blowups or does the script complete? If all the rest look good then it is probably just a problem with not doing enough checks for this particular vmi structure. If all are wrong, it indicates a relocation issue or some other general processing issue in your program. You can check by looking for other references to the __cxxabiv1::__vmi_class_type_info::typeinfo, the __cxxabiv1::__si_class_type_info::typeinfo, and the __cxxabiv1::__class_type_info::typeinfo. Do you see good typeinfo structures laid down there once the script completes?

If you have any other log information pertaining to this script after it runs or any log information about the import and not being able to handle relocations or other import or analysis log errors/warnings, I would be interested in those as well.

Thanks! Hopefully we can get to the bottom of what is happening here.

@redfast00
Contributor Author

@ghidra007 it does work a lot better now (it can entirely complete running the plugin), but there are indeed a lot of errors in the console:

DEBUG 0118b400: VmiTypeinfoStructure has invalid number of bases: 0  (RTTIGccClassRecoverer.java:2747) 
DEBUG 0118e300: VmiTypeinfoStructure has invalid number of bases: -1  (RTTIGccClassRecoverer.java:2747) 
...

(There are a lot more errors; about half report 0 bases, the other half report -1.)
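A value of -1 is what an all-ones 32-bit field looks like when it is read as a signed integer, which fits the idea that these locations are not real vmi typeinfo structures. The sketch below illustrates the kind of sanity check that skips such a structure instead of building an array from it. It is not the actual Ghidra change: the field offset assumes the LP64 little-endian layout of `__vmi_class_type_info` from the Itanium C++ ABI, and `read_base_count`, `is_plausible_base_count`, and the `max_bases` ceiling are hypothetical names and values chosen for illustration.

```python
import struct

# Assumed LP64 little-endian layout of __vmi_class_type_info (Itanium C++ ABI):
#   +0 vptr, +8 name pointer, +16 flags (32-bit), +20 base count (32-bit)
BASE_COUNT_OFFSET = 20


def read_base_count(raw: bytes) -> tuple[int, int]:
    """Return the (unsigned, signed) views of the 32-bit base-count field."""
    (unsigned,) = struct.unpack_from("<I", raw, BASE_COUNT_OFFSET)
    (signed,) = struct.unpack_from("<i", raw, BASE_COUNT_OFFSET)
    return unsigned, signed


def is_plausible_base_count(signed_count: int, max_bases: int = 1024) -> bool:
    """Reject zero, negative, and absurdly large counts.

    max_bases is an arbitrary illustrative ceiling, not a Ghidra constant.
    """
    return 1 <= signed_count <= max_bases


# Hypothetical header whose base-count field is filled with 0xFF bytes.
bogus = bytes(20) + b"\xff\xff\xff\xff"
unsigned, signed = read_base_count(bogus)
print(unsigned, signed, is_plausible_base_count(signed))
```

Both values seen in the log, 0 and -1, fail this check, so a caller could log the address and move on rather than pass the count to an array constructor.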

Image

I then interpreted the addresses above and below as pointers:

Image

The pointer in the middle points to:

Image

There are two unsupported relocations in the binary:

Image

There are a lot of other errors:

DEBUG Could not get typeinfo string from 01137900  (RTTIGccClassRecoverer.java:2796) 
DEBUG Could not get typeinfo string from 4862a5b4  (RTTIGccClassRecoverer.java:2899) 
DEBUG Could not create typeinfo symbol at 01137900  (RTTIGccClassRecoverer.java:2389) 
DEBUG Could not get typeinfo string from 01137d00  (RTTIGccClassRecoverer.java:2796) 
DEBUG Could not get typeinfo string from 64ba864c  (RTTIGccClassRecoverer.java:2899) 
DEBUG Could not create typeinfo symbol at 01137d00  (RTTIGccClassRecoverer.java:2389) 
DEBUG Could not get typeinfo string from 01139500  (RTTIGccClassRecoverer.java:2796) 
DEBUG Could not get typeinfo string from ffffffff918a3598  (RTTIGccClassRecoverer.java:2899) 
...
DEBUG 0118b400: VmiTypeinfoStructure has invalid number of bases: 0  (RTTIGccClassRecoverer.java:2747) 
DEBUG 0118e300: VmiTypeinfoStructure has invalid number of bases: -1  (RTTIGccClassRecoverer.java:2747) 
...
DEBUG 010aec30: invalid typeinfo - no special classtypeinfo ref'd by baseTypeinfo[1]  (RTTIGccClassRecoverer.java:2556) 
DEBUG 010b0958: invalid typeinfo - no special classtypeinfo ref'd by baseTypeinfo[0]  (RTTIGccClassRecoverer.java:2556) 
...
DEBUG Invalid vtable at 01086938  (RTTIGccClassRecoverer.java:1063)
....
DEBUG MISSING expected vtable for simple class REDACTED
DEBUG Cannot find vtt for vtable at 010cfa60  (RTTIGccClassRecoverer.java:800) 
DEBUG Cannot find vtt for vtable at 010cfb08  (RTTIGccClassRecoverer.java:800) 
...
ERROR Parent class's RecoveredClass object for : REDACTED should already exist.   (RTTIGccClassRecoverer.java:3566) 
ERROR Removing class: REDACTED from list to process since parent information is not availalbe.  (RTTIGccClassRecoverer.java:664) 
ERROR Parent class's RecoveredClass object for REDACTED should already exist.   (RTTIGccClassRecoverer.java:3566) 
ERROR Removing class: REDACTED from list to process since parent information is not availalbe.  (RTTIGccClassRecoverer.java:664) 

@ghidra007
Contributor

Can you tell me what the warning bookmark says at address 118b400?

Also, what is at address 460648f0? Does it look like a mangled string? If so, can you turn it into a string and then use the GnuDemanglerScript.java script (attached) to demangle it? You will have to copy the string and edit the script to hardcode that string. The script must be placed in your <ghidra_install>/Ghidra/Features/GnuDemangler/ghidra_scripts folder. See the comment above the hard-coded string example for how to possibly manipulate the string to get it to demangle. Does it demangle to "typeinfo name for <that class's name>"?
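For a quick manual check of whether a string looks like a typeinfo name, the length-prefixed form used by the Itanium C++ ABI can be decoded by hand. The sketch below is a standalone illustration, not the attached script and not a full demangler: `decode_simple_name` is a hypothetical helper that only handles plain identifiers, optionally wrapped in `N...E`, and ignores templates, substitutions, and qualifiers. The sample input is the `__vmi_class_type_info` vtable symbol that appears in this thread's relocation table export, with its `_ZTV` prefix stripped.

```python
def decode_simple_name(encoding: str) -> str:
    """Decode a simple Itanium-mangled name such as 'N3foo3BarE' to 'foo::Bar'.

    Only length-prefixed identifiers are supported; anything else raises
    ValueError so that a non-name string is not silently accepted.
    """
    nested = encoding.startswith("N") and encoding.endswith("E")
    body = encoding[1:-1] if nested else encoding
    parts = []
    i = 0
    while i < len(body):
        j = i
        while j < len(body) and body[j].isdigit():
            j += 1
        if j == i:
            raise ValueError(f"expected a length prefix at offset {i}")
        length = int(body[i:j])
        part = body[j:j + length]
        if length == 0 or len(part) != length:
            raise ValueError(f"bad identifier length at offset {i}")
        parts.append(part)
        i = j + length
    if not parts:
        raise ValueError("empty name")
    return "::".join(parts)


vtable_symbol = "_ZTVN10__cxxabiv121__vmi_class_type_infoE"
print(decode_simple_name(vtable_symbol[len("_ZTV"):]))
```

A typeinfo-name string in the binary carries the same encoding without a symbol prefix, so prepending `_ZTS` to it gives a symbol that a real demangler should render as "typeinfo name for ...".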

If you look at the other places that reference the vmi typeinfo (11e3268) do they look like good typeinfo structs?

@ghidra007
Contributor

ghidra007 commented Feb 20, 2025

GnuDemanglerScript.txt

NOTE: you will have to rename it from .txt to .java.

@redfast00
Contributor Author

@ghidra007

> Can you tell me what the warning bookmark says at address 118b400?

Image

> Also, what is at address 460648f0?

This address is not in the binary: in the memory map, there aren't any addresses starting with 4

> If you look at the other places that reference the vmi typeinfo (11e3268) do they look like good typeinfo structs?

This is the first place that references the vmi typeinfo; I don't really know what to look for, does this look good? The warnings are EXTERNAL relocation warnings as well.

Image

@astrelsky
Contributor

astrelsky commented Feb 26, 2025

415249268-60d57809-a73d-4964-9f99-5fc056790e8c.png

What section is this in and what does the data around it look like? This is suspicious and I don't think this instance is valid. I'm wondering if it is choking on some unrelated reference to the vmi class type info.

In my experience, the problem with the Itanium ABI RTTI analysis is more often that it refuses to do any analysis and claims there is no RTTI while I'm looking right at it.

Edit: I think either something is wrong with the relocations or some analyzer went haywire and created an invalid reference there. A pointer to __vmi_class_typeinfo::vtable+0x4xxxxxx makes no sense.

@redfast00
Contributor Author

> What section is this in and what does the data around it look like? This is suspicious and I don't think this instance is valid. I'm wondering if it is choking on some unrelated reference to the vmi class type info.

This is in .data; the correct typeinfo structs (small offset, all 0x10) are in .data.rel.ro. All the pointers in .data have gigantic positive or negative offsets.
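The "small offset versus gigantic offset" distinction can be checked mechanically by subtracting the symbol address from each pointer. The sketch below does this for the two pointer values that appear in the relocation table export posted in this thread. It assumes that 11e3268, the address mentioned earlier, is where the `__vmi_class_type_info` vtable symbol resolves; `offset_from_symbol` and `looks_like_typeinfo_vptr` are hypothetical helper names.

```python
# Assumption: address where the __vmi_class_type_info vtable symbol resolves.
VTABLE_SYMBOL = 0x011E3268


def offset_from_symbol(pointer: int, symbol: int = VTABLE_SYMBOL) -> int:
    """Distance of a pointer from the symbol it was relocated against."""
    return pointer - symbol


def looks_like_typeinfo_vptr(pointer: int, symbol: int = VTABLE_SYMBOL) -> bool:
    """The valid structs in this thread all sit 0x10 past the vtable symbol."""
    return offset_from_symbol(pointer, symbol) == 0x10


# First eight bytes at each location, read as little-endian 64-bit pointers.
good = int.from_bytes(bytes.fromhex("78 32 1E 01 00 00 00 00"), "little")
bad = int.from_bytes(bytes.fromhex("E8 FA 06 46 00 00 00 00"), "little")

for label, pointer in (("good", good), ("bad", bad)):
    print(label, hex(pointer), hex(offset_from_symbol(pointer)),
          looks_like_typeinfo_vptr(pointer))
```

Filtering candidate references this way would keep the entries in .data.rel.ro and drop the ones in .data, under the stated assumption about the symbol address.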

@ghidra007
Contributor

ghidra007 commented Feb 26, 2025

@redfast00 The other typeinfo structs you shared look good to me. Is your image before or after the script ran? The script should be creating those.

Can you create an ImageBaseOffset64 at this location: 460648f0 and see if it creates a reference to anything valid? If so, does it reference a mangled typeinfo-name string?

I agree that the relocation amount in the bad vmi looks odd. I am going to ask our relocation expert to take a look and see if they can direct you for more information to determine if that is the issue.

For the good class structures can you tell me what their warning bookmark says?

Also, we need to see the entry in the Relocation Table for the bad vmi address and for a good one. If you bring up that window, filter on address 0118b400 for the bad one and on 010ae800 for a good one. Can you also add the Bytes column to that window so we can see it alongside the other columns you showed for the unsupported ones?

Also, I'm going to walk you through how to find the relocation structure entry so you can show us what that looks like.
First, figure out your image base by opening the Memory Map window and clicking on the house icon. The value in the resulting dialog's text field is your image base. Subtract the image base from 0118b400 to get the relative offset of the relocation. Now run Search Memory using the result of the subtraction (in hex), but first open the options panel and make sure to also select "All Other Blocks". If you click on each result in the list (there should probably be only one or two), you should see one that is an Elf_Rela (or possibly an Elf_Rel) where the bytes you searched for form the first member of that structure (the r_offset one). Once you find that structure, can you post an image of it here or share the values for each structure member? We are most interested in the r_addend one for the found Elf_Rela.
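For reference, a 64-bit Elf_Rela entry is three 8-byte fields: r_offset, r_info (symbol index in the upper 32 bits, relocation type in the lower 32 bits), and a signed r_addend. The sketch below decodes one such record from raw bytes. The sample record is hypothetical: its offset, symbol index, and type are borrowed from values visible in this thread, on the assumption that the table's "Values" column holds the symbol index, and its addend of 0x10 is invented purely for illustration.

```python
import struct
from typing import NamedTuple


class Elf64Rela(NamedTuple):
    r_offset: int
    r_info: int
    r_addend: int

    @property
    def sym(self) -> int:
        """Symbol table index: upper 32 bits of r_info."""
        return self.r_info >> 32

    @property
    def rel_type(self) -> int:
        """Relocation type: lower 32 bits of r_info."""
        return self.r_info & 0xFFFFFFFF


def parse_rela(raw: bytes, little_endian: bool = True) -> Elf64Rela:
    """Decode the first 24 bytes of raw as one Elf64_Rela record."""
    fmt = ("<" if little_endian else ">") + "QQq"
    return Elf64Rela(*struct.unpack_from(fmt, raw, 0))


# Hypothetical record: type 0x101 is R_AARCH64_ABS64; the addend is made up.
sample = struct.pack("<QQq", 0x0118B400, (0x4E << 32) | 0x101, 0x10)
entry = parse_rela(sample)
print(hex(entry.r_offset), hex(entry.sym), hex(entry.rel_type), hex(entry.r_addend))
```

For an absolute 64-bit relocation, the value written is the symbol address plus r_addend, which is why the addend of the bad entry is the field of interest.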

@redfast00
Contributor Author

> Is your image before or after the script ran? The script should be creating those.

Before, this is after:

Image

> Can you create an ImageBaseOffset64 at this location: 460648f0 and see if it creates a reference to anything valid? If so, does it reference a mangled typeinfo-name string?

I cannot: 460648f0 is not a valid address in this binary.

> For the good class structures can you tell me what their warning bookmark says?

Image

> Also, we need to see the entry in the Relocation Table for the bad vmi address and for a good one. So if you bring up that window filter on address 0118b400 for the bad one and also 010ae800 for a good one. Can you add the bytes column to that window so we can see that as well as the other columns you showed for the unsupported ones.

For your convenience, I used the "export as CSV" option:

"Location","Status","Type","Values","Original Bytes","Name","Bytes"
"010ae800","APPLIED","0x101","0x4e","00 00 00 00 00 00 00 00","_ZTVN10__cxxabiv121__vmi_class_type_infoE","78 32 1E 01 00 00 00 00 70 C4 DA 00 00 00 00 00 00 00 00 00 02 00 00 00 08 EB 0A 01 00 00 00 00 02 00 00 00 00 00 00 00 D0 EA 0A 01 00 00 00 00 02 10 00 00 00 00 00 00"
"0118b400","APPLIED","0x101","0x4e","00 00 00 00 00 00 00 00","_ZTVN10__cxxabiv121__vmi_class_type_infoE","E8 FA 06 46 00 00 00 00"
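The Bytes column of the good entry can be decoded by hand to see what a valid structure contains. The sketch below unpacks those 56 bytes, assuming the LP64 little-endian layout of `__vmi_class_type_info` from the Itanium C++ ABI: a vptr, a name pointer, 32-bit flags, a 32-bit base count, then one (base typeinfo pointer, offset_flags) pair per base, with the offset stored in the bits above the low byte. `decode_vmi` is a hypothetical helper, not Ghidra code.

```python
import struct

# Bytes column of the 010ae800 row, copied from the CSV export.
GOOD = bytes.fromhex(
    "78 32 1E 01 00 00 00 00 70 C4 DA 00 00 00 00 00 "
    "00 00 00 00 02 00 00 00 08 EB 0A 01 00 00 00 00 "
    "02 00 00 00 00 00 00 00 D0 EA 0A 01 00 00 00 00 "
    "02 10 00 00 00 00 00 00"
)


def decode_vmi(raw: bytes) -> dict:
    """Decode a vmi typeinfo header and as many base entries as raw holds."""
    vptr, name, flags, base_count = struct.unpack_from("<QQIi", raw, 0)
    bases = []
    for k in range(max(base_count, 0)):
        start = 24 + 16 * k
        if start + 16 > len(raw):
            break  # not enough bytes for another base entry
        base_type, offset_flags = struct.unpack_from("<Qq", raw, start)
        bases.append({
            "base_type": base_type,
            "offset": offset_flags >> 8,    # offset lives above the low byte
            "flags": offset_flags & 0xFF,   # 0x1 virtual, 0x2 public
        })
    return {"vptr": vptr, "name": name, "flags": flags,
            "base_count": base_count, "bases": bases}


info = decode_vmi(GOOD)
print(hex(info["vptr"]), hex(info["name"]), info["base_count"])
for base in info["bases"]:
    print(hex(base["base_type"]), hex(base["offset"]), hex(base["flags"]))
```

Under that layout the good entry describes two public bases, at offsets 0 and 0x10. The bad row exports only eight bytes, so there is nothing further to decode for it.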

> First, figure out your image base by opening the memory map window and clicking on the house icon. The value in the resulting dialog's text field is your image base. Subtract the image base from 0118b400 to get the relative offset of the relocation. Now run Search Memory using the result of the subtraction (in hex) but first open the options panel and make sure to also select "All Other Blocks". If you click on each result in the list (should only have one or two probably) you should see one that is a Elf_Rela (or possibly a Elf_Rel) where the bytes you searched form the first member of that structure (the r_offset one). Once you find that structure can you post an image of it here or share the values for each structure member. We are most interested in the r_addend one for the found Elf_Rela.

  • Image base = 00400000
  • Relative offset of the relocation = 00d8b400
  • No results:

Image

(also no results when I selected big endian)

However, when I skipped the image base subtraction step, I did find:

Image

and

Image
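The arithmetic and the byte patterns behind these searches can be reproduced as follows. This is a small sketch of the steps described above: it computes the relative offset from the image base and shows the 8-byte patterns, in both byte orders, for the relative offset and for the absolute address that produced hits. `search_patterns` is a hypothetical helper name.

```python
IMAGE_BASE = 0x00400000
ADDRESS = 0x0118B400


def search_patterns(value: int, width: int = 8) -> dict:
    """Hex byte strings for value as a width-byte field in both byte orders."""
    return {order: value.to_bytes(width, order).hex(" ")
            for order in ("little", "big")}


relative = ADDRESS - IMAGE_BASE
print(hex(relative))
print("relative:", search_patterns(relative))
print("absolute:", search_patterns(ADDRESS))
```

A hit on the absolute pattern but not on the relative one means the stored r_offset already includes the image base.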

@ghidra1
Collaborator

ghidra1 commented Feb 27, 2025

@redfast00 Do you see both relocation records in the same binary? If so, can you indicate the name of the memory section (if available) that each is contained within? If they both exist, the implication is that the binary has been pre-linked and relocations should not be applied. We are not good at detecting and handling pre-linked ELF binaries. You could try importing the binary with the Import Option for Relocation Processing disabled, using the default image base. Without the relocation processing the offset-reference will not be applied, so I don't know how well the RecoverClassesFromRTTIScript will work in that situation. Could you try this and let us know what you see?

If relocations are being applied when they should not be due to prelinking, the issues could go beyond the improper pointer markup you identified. After re-importing with relocation processing disabled, I suggest doing a Program Diff between the two cases (Bytes only) to see what other areas may have been affected.

@ghidra1
Collaborator

ghidra1 commented Feb 27, 2025

@redfast00 Any "EXTERNAL" data relocation cannot be pre-linked. Do you have an "EXTERNAL" block in the program?

@ghidra1
Collaborator

ghidra1 commented Feb 27, 2025

@redfast00 What operating system is this binary intended to load/execute under?

@ghidra007 added the "Status: Waiting on customer" label (Waiting for customer feedback) on Feb 27, 2025
@ghidra1
Collaborator

ghidra1 commented Feb 27, 2025

I am re-opening this ticket until we come to a better understanding on the relocation processing.

@ghidra1 reopened this on Feb 27, 2025

5 participants