Giridhar Addepalli
2014-09-28 09:17:53 UTC
Hi All,
I am going through the Quorum Journal design document.
Section 2.8, in the description of the Accept Recovery RPC, says:
"
If the current on-disk log is missing, or a *different length* than the
proposed recovery, the JN downloads the log from the provided URI,
replacing any current copy of the log segment.
"
I can see that the code follows the above design.
Source: Journal.java
....
public synchronized void acceptRecovery(RequestInfo reqInfo,
    SegmentStateProto segment, URL fromUrl)
    throws IOException {
  ....
  if (currentSegment == null ||
      currentSegment.getEndTxId() != segment.getEndTxId()) {
    ....
  } else {
    LOG.info("Skipping download of log " +
        TextFormat.shortDebugString(segment) +
        ": already have up-to-date logs");
  }
  ....
}
....
My question is: what if the on-disk log is present and is the *same length*
as the proposed recovery?
If the JournalNode skips the download because the logs are the same length
(the code above compares only the end transaction IDs), then we could end up
in a situation where finalized log segments contain different data!
This could happen if we follow example 2.10.6.
As per that example, transactions 151-153 are written on JN1.
Then, when recovery proceeds with only JN2 and JN3, let us assume that
*different transactions* are written again as 151-153. After the crash,
when we run recovery, JN1 will skip downloading the correct segment from
JN2/JN3 because it thinks it already has the correct segment (as per the
code pasted above).
This would leave the finalized segment edits_151-153 on JN1 different from
the finalized segment edits_151-153 on JN2/JN3.
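
To make the scenario concrete, here is a minimal, self-contained sketch. It is not the actual HDFS code: the Segment class, the payload strings, and the method names needsDownloadByEndTxId and sameContents are hypothetical stand-ins I made up for illustration. It only models the end-txid comparison from the snippet above and shows that two segments covering 151-153 can pass that check while holding different bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SegmentCheckSketch {

    // Hypothetical stand-in for a log segment: a txid range plus raw bytes.
    static final class Segment {
        final long startTxId;
        final long endTxId;
        final byte[] data;

        Segment(long startTxId, long endTxId, String payload) {
            this.startTxId = startTxId;
            this.endTxId = endTxId;
            this.data = payload.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Models the condition in the quoted snippet: download only when the
    // local segment is missing or its end txid differs from the proposal.
    static boolean needsDownloadByEndTxId(Segment current, Segment proposed) {
        return current == null || current.endTxId != proposed.endTxId;
    }

    // Assumption for illustration: a stricter comparison that also looks
    // at the bytes, not only at the txid range.
    static boolean sameContents(Segment a, Segment b) {
        return a.startTxId == b.startTxId
            && a.endTxId == b.endTxId
            && Arrays.equals(a.data, b.data);
    }

    public static void main(String[] args) {
        // JN1 holds the transactions written before the first crash.
        Segment onJn1 = new Segment(151, 153, "old-151,old-152,old-153");
        // JN2/JN3 hold different transactions with the same txid range.
        Segment onJn2 = new Segment(151, 153, "new-151,new-152,new-153");

        System.out.println("download needed (end txid check): "
            + needsDownloadByEndTxId(onJn1, onJn2));
        System.out.println("contents identical: "
            + sameContents(onJn1, onJn2));
    }
}
```

Running it reports that no download is needed under the end-txid check, while the contents are not identical, which is exactly the divergence I am worried about.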
Please let me know if I have gone wrong somewhere, or whether this
situation is already taken care of.
Thanks,
Giridhar.