This is definitely not the ideal solution to solve this problem.
But I think this is better than destroying my RAM.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
We had better have a test for it. It is not a strict NFC and we best verify it works as we expect -- like if we miss any place which still uses StringRef refering to string in file data buffer, we may have dangling pointer after we free the buffer.
llvm/tools/llvm-profdata/llvm-profdata.cpp | ||
---|---|---|
444–448 | Please add some comment about why we need FunctionNames. I think it is because FunctionSamples and SampleRecord all use StringRef and the original strings refered to are in the data buffer. To remove the dependence on the data buffer, we want to use a name table to save all the name strings. The usage is not very straightforward. |
I agree with you that this is not completely NFC. However, I didn't really see what could best test this functionality (anyway, in our case, accessing dangling pointer would lead to SIGSEGV as we unmap the files).
Thanks. If we remove the code to replace string reference from file data buffer to FunctionNames set for call targets, will the test fail?
No, because this change is kind of a NFC, which implies that I want to keep the same behaviors as previously.
To actually have a test that fails if we remove this code, we would need to have a super heavy and time consuming test, which is, imo, not what we want for a test.
Do we really need to have a tests that fails if we remove this code ?
Is it possible to add an option to flush the file data buffer with 0 after the file is released? The option is only enabled for testing. With the option on, even with small test we can catch some difference in merge result if the patch is incorrect.
I realize I wasn't clear. I mean flush the data buffer after reading and merging for the file is done and just before the file is released.
I am very sorry, but I don't really understand what you want... The files are mmap'd with only READ protection, so zeroing the data isn't really an option!
Plus, we munmap the file when we destroy the object at the end of the loop, so accessing again the data would result in a segfault already... we cannot do much more here I think, but maybe I am missing smth ?
Sorry, I didn't have time to work on this the past 10 days..
From my last reply "I assumed the input files are small since we want to have small testcase. If the input files are small, the buffer won't be allocated through mmap but through new (From getOpenFileImpl in lib/Support/MemoryBuffer.cpp)".
Do you think we can construct a testcase with small inputs so the buffer could be writable?
Plus, we munmap the file when we destroy the object at the end of the loop, so accessing again the data would result in a segfault already...
If that is the case, why you didn't see SEGV when you remove the code to replace string reference from file data buffer to FunctionNames set for call targets? I think munmap area could possibly be remapped when reading in the next file?
Please add some comment about why we need FunctionNames. I think it is because FunctionSamples and SampleRecord all use StringRef and the original strings refered to are in the data buffer. To remove the dependence on the data buffer, we want to use a name table to save all the name strings. The usage is not very straightforward.