Bad sector repair

Have an idea how to make KryoFlux even better? Let us know...
Darkstar
Posts: 72
Joined: Thu Nov 04, 2010 7:58 pm

Bad sector repair

Post by Darkstar » Tue Nov 09, 2010 10:57 pm

Sometimes it can't be helped, even with -t999 I still get this from time to time:

64.1    : MFM: <error>, trk: 064, sec: 18, bad: 1
64.1    : Bad sector found
64.1    : Read operation failed

Since chances are that over those 999 passes the one sector marked "bad" was partially correct, in different positions on different tries, why not make it possible to repair a sector on a per-byte basis through a histogram? For example, if the 3rd byte of the sector is 0x44 in 766 of these 999 reads, one can be pretty sure that 0x44 is indeed the correct value. Rinse and repeat for all 512 bytes of the bad sector and one should get a pretty good sector dump. Or not?

Obviously this should only apply to sector dumps (MFM in my case) and not to stream files
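
A minimal sketch of the per-byte histogram idea, assuming the retries are already available as equal-length byte strings. The function name `majority_vote` and the sample data are made up for illustration and are not part of any KryoFlux tool. It also assumes every read is byte-aligned, which may not hold when a lost flux pulse shifts the rest of the sector.

```python
from collections import Counter


def majority_vote(reads):
    """Combine several reads of one sector by per-byte majority vote.

    reads: list of equal-length bytes objects (hypothetical input format).
    Returns (voted_sector, confidence), where confidence[i] is the fraction
    of reads that agreed with the winning value at byte offset i.
    """
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("all reads must have the same length")

    voted = bytearray(length)
    confidence = []
    for i in range(length):
        histogram = Counter(r[i] for r in reads)      # value -> number of reads
        value, count = histogram.most_common(1)[0]    # most frequent value wins
        voted[i] = value
        confidence.append(count / len(reads))
    return bytes(voted), confidence


# Three made-up reads of a 4-byte "sector", each wrong in a different place.
reads = [bytes([0x44, 0x11, 0x22, 0x33]),
         bytes([0x44, 0xFF, 0x22, 0x33]),
         bytes([0x00, 0x11, 0x22, 0x33])]
sector, confidence = majority_vote(reads)
print(sector.hex(), confidence)
```

A voted sector is only a guess. It would still need to pass the sector's CRC before being treated as a good read, and low values in `confidence` point at the bytes that deserve a closer look.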

-Darkstar

mr.vince
Posts: 2120
Joined: Tue Oct 05, 2010 5:48 pm

Re: Bad sector repair

Post by mr.vince » Wed Nov 10, 2010 12:21 am

Good idea, but really one that goes in the direction of data recovery... which is the field of big money... If you have ever tried to get something salvaged, you know what I mean. :)

But in fact... a really nice one. Maybe something for a v2, or a different app.

philpem
Posts: 5
Joined: Tue Feb 01, 2011 2:58 pm

Re: Bad sector repair

Post by philpem » Tue Feb 01, 2011 3:12 pm

You can -- in theory -- run the CRC backwards to find the erroneous bits. It's basically how the old MFM hard drives worked -- generate a 32-bit CRC for each sector, then verify it when reading (note that CRC32(concat(data, CRC32(data))) = 0 for a CRC without a final XOR; with a final XOR the result is a fixed non-zero constant instead). If the CRC check fails, then the result of CRC32(concat(data, crc_in_sector_trailer)) can be run backwards to produce an error 'syndrome' -- a map of which bits are wrong.
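
The zero-residue property can be tried out with a small sketch. It uses a 16-bit CRC with polynomial 0x1021 and initial value 0xFFFF (the CRC-CCITT variant used by IBM-format floppy sectors) rather than a CRC32, and it covers only the data bytes, whereas a real sector CRC also covers the sync bytes and address mark. The helper name `crc16_ccitt` and the sample data are assumptions for illustration.

```python
def crc16_ccitt(data, crc=0xFFFF):
    """Bitwise CRC-16, polynomial 0x1021, MSB first, no final XOR."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


data = bytes(range(256)) * 2                 # stand-in for 512 bytes of sector data
stored = crc16_ccitt(data).to_bytes(2, "big")

# Running the CRC over data plus its own CRC leaves a zero residue.
residue_good = crc16_ccitt(data + stored)

damaged = bytearray(data)
damaged[100] ^= 0x08                         # flip a single bit
residue_bad = crc16_ccitt(bytes(damaged) + stored)

print(hex(residue_good), hex(residue_bad))
```

The non-zero residue of the damaged copy is the "syndrome" mentioned above. Whether it can be mapped back to a unique error position depends on how many bits are wrong and on the block length.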

Catch is, there's a limit to how many bits you can correct -- it's either five or eight depending on the CRC32 polynomial you're using, and I'll bet it'll only be maybe two or four bits at best with a CRC16, if even that.

Finding sync errors is fairly easy though. On FM you look for missing clock bits, on MFM you look for illegal coding values. Find the locations of the sync errors and you'll probably be pretty close to the error. Generally speaking, you tend to lose magnetic pulses more often than you gain spurious ones, unless the disc has been AC bulk erased.
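
One simple form of "illegal coding" check on MFM is the run-length rule: in a raw bitcell stream, two flux transitions are always separated by one, two or three empty cells. The sketch below assumes the stream is already available as a list of 0/1 bitcells; the function names and that input format are hypothetical, not something the KryoFlux software exposes.

```python
def mfm_violations(bitcells):
    """Return indices of transitions that end an illegal gap in raw MFM.

    bitcells: sequence of 0/1, one entry per bitcell (1 = flux transition).
    Legal MFM has at least one and at most three 0 cells between two 1 cells.
    """
    violations = []
    previous_one = None
    for i, cell in enumerate(bitcells):
        if not cell:
            continue
        if previous_one is not None:
            zeros = i - previous_one - 1
            if zeros < 1 or zeros > 3:
                violations.append(i)
        previous_one = i
    return violations


def bits(word, width=16):
    """Expand an integer into a list of bits, most significant bit first."""
    return [(word >> (width - 1 - n)) & 1 for n in range(width)]


sync = bits(0x4489)            # the MFM A1 sync mark as raw bitcells
clean = sync + sync
dropped = list(clean)
dropped[12] = 0                # simulate one lost flux transition
print(mfm_violations(clean), mfm_violations(dropped))   # [] [15]
```

The index reported is the transition that closes the over-long gap, so the actual damage lies somewhere in the cells just before it. A fuller check would also compare clock bits against the neighbouring data bits, keeping in mind that sync marks break that clock rule on purpose.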

OCMoe
Posts: 42
Joined: Thu Dec 02, 2010 1:29 am
Location: Oldenburg, Germany

Re: Bad sector repair

Post by OCMoe » Fri Feb 04, 2011 5:27 pm

Feel free to correct me, but does CRC really allow you to correct errors? It allows you to detect multiple bit-errors, but not correct them, as there is no way to locate the erroneous bits (unless we're talking about very short blocks, in which case there might be). Simply do the math: Take all combinations of, say, 2-bit-errors in 512 bytes, i.e. 4096 bits. There are 4095+4094+4093+... possibilities, i.e. (4096*4095)/2 = 8386560 combinations. A CRC16 has only 65536 different values, so not even two-bit errors can be uniquely identified. Calculate accordingly for CRC32 and more bit errors.
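
The counting argument can be checked in a few lines. This sketch counts error patterns in bits (a 512-byte sector holds 4096 data bits) and ignores the CRC bits themselves, which can of course be damaged too. The variable names are only illustrative.

```python
from math import comb

sector_bits = 512 * 8          # data bits in one 512-byte sector
crc16_values = 2 ** 16         # distinct values of a 16-bit CRC
crc32_values = 2 ** 32         # distinct values of a 32-bit CRC

one_bit = comb(sector_bits, 1)
two_bit = comb(sector_bits, 2)
three_bit = comb(sector_bits, 3)

for errors, patterns in ((1, one_bit), (2, two_bit), (3, three_bit)):
    # A CRC can only tell patterns apart if it has at least as many values.
    print(errors, patterns, patterns <= crc16_values, patterns <= crc32_values)
```

Having enough CRC values is a necessary condition only. It says nothing about whether a given polynomial actually maps each pattern to a different value.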

Now if you were talking about ECC codes used in today's hard disks, they work differently and have nothing to do with floppies.

BTW, the exact amount that can be detected reliably varies with the polynomial and block size. The numbers you mention are among the more typical values. See this excellent article, page 6 for an interesting table about message size vs. CRC bits vs. detectable bit-errors, albeit only up to CRC16. It's an interesting read if you're curious about modern insights into CRCs. They also list uncommon polynomials which are optimal for certain situations.

MultiDisk
Posts: 14
Joined: Wed Jan 26, 2011 3:41 pm

Re: Bad sector repair

Post by MultiDisk » Wed Feb 09, 2011 9:18 am

Hmm.. CRC shouldn't let you correct errors, if you have no other information than a single set of data & a crc.. except here, we have a set of data & a crc mismatch.. plus, just possibly by repeated reads you have multiple sets of data, within which some values may be stable, and others fluctuating.. and maybe (depending on the exact info the h/w can report.. can it tell how 'fuzzy' a read is?) you can even say which values you are certain about, and which you are not..

At which point, assuming the crc is stable/trustable and there's only a limited number of bytes in question, how many permutations of those bytes are then required to produce a total dataset that agrees with the crc?

(Of course, if the crc can't be read reliably, and there are other areas in question.. all bets are off.. tho possibly, if only the crc is questionable, it could be rebuilt.. possibly even a rebuilt crc could be compared against the questionable data, but that really would require some detailed info as to why the crc was considered read badly)
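
As a rough sketch of that search: find the byte offsets where the reads disagree, try every combination of the values seen there, and keep the combinations whose CRC matches the stored one. It assumes the stored CRC is trustworthy, uses a CRC-16 with polynomial 0x1021 and initial value 0xFFFF over the data bytes only, and all names (`repair_with_crc`, `max_candidates`) are invented for this example.

```python
from itertools import product


def crc16_ccitt(data, crc=0xFFFF):
    """Bitwise CRC-16, polynomial 0x1021, MSB first, no final XOR."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc


def repair_with_crc(reads, stored_crc, max_candidates=100_000):
    """Return (unstable_offsets, candidates that match stored_crc)."""
    length = len(reads[0])
    # Every distinct value observed at each offset across all reads.
    options = [sorted({r[i] for r in reads}) for i in range(length)]
    unstable = [i for i, seen in enumerate(options) if len(seen) > 1]

    total = 1
    for i in unstable:
        total *= len(options[i])
    if total > max_candidates:
        raise ValueError(f"{total} combinations is too many to try")

    matches = []
    candidate = bytearray(reads[0])
    for combo in product(*(options[i] for i in unstable)):
        for i, value in zip(unstable, combo):
            candidate[i] = value
        if crc16_ccitt(candidate) == stored_crc:
            matches.append(bytes(candidate))
    return unstable, matches


original = bytes(range(16))                  # made-up 16-byte "sector"
stored_crc = crc16_ccitt(original)
read1 = bytearray(original); read1[3] ^= 0x01
read2 = bytearray(original); read2[9] ^= 0x02
read3 = bytearray(original); read3[3] ^= 0x10; read3[9] ^= 0x40
unstable, matches = repair_with_crc([bytes(read1), bytes(read2), bytes(read3)], stored_crc)
print(unstable, [m.hex() for m in matches])
```

None of the three reads is correct on its own here, yet the search recovers the original. With many unstable bytes the number of combinations explodes, and with a 16-bit CRC roughly one in 65536 wrong candidates will match by chance, so more than one match means the result cannot be trusted.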

Darkstar
Posts: 72
Joined: Thu Nov 04, 2010 7:58 pm

Re: Bad sector repair

Post by Darkstar » Wed Feb 23, 2011 11:46 am

Just for the record, I didn't suggest using the CRC to somehow recover sector-data. I merely wanted to be able to do like 500 reads and then do a simple byte-wise "majority vote" of the 500 copies of the sector. I admit it will probably only work in sector-dump mode (not in stream dump mode) but I'd be willing to take that pill (i.e. having a working sector-dump is sometimes more important than having a 100% perfect flux dump)

IFW
Posts: 3079
Joined: Mon Nov 08, 2010 2:42 pm

Re: Bad sector repair

Post by IFW » Wed Feb 23, 2011 1:03 pm

You can end up with seemingly good reads, despite the fact that they are not real, especially for systems with weak EDCs, such as pretty much any system that is not based on a generic FM or MFM controller.
Remember, the reads are not constrained by hardware artefacts of the PLL etc. that normally affect data read through controllers.

What you get is what you can get; a very slightly different track alignment due to seeking can help, though.

You already have a read, as far as it gets, since the bad read is already stored in the sector image.
You could increase the number of retries as well.

Techniques related to fixing bad reads when read on target platforms are due to different capabilities of the target hardware and won't work once you go to the lowest level.
Instead, you already get better chances than the target would ever have, due to its lack of processing capabilities, analysis etc. - FDCs are constrained by the need for real-time usage and other factors.

Although you can simulate similar "fixing" effects in software easily, there is no real reason to artificially create a fake good/bad read... when you have proof that it's not good at all. It would be more like a comfort zone thing: you got used to thinking it worked, and it would give the same feeling despite being technically incorrect.

What one should do instead is a complex analysis of the cells, highlighting weak spots and, based on cell behaviour, coming up with alternate data paths. This could be further enhanced with various techniques, but I am not here to give advice for data recovery companies - they already have a very lucrative business :lol:

TeaRex
Posts: 120
Joined: Tue Nov 30, 2010 5:36 am

Re: Bad sector repair

Post by TeaRex » Thu Feb 24, 2011 1:56 am

I think that probably you'd have to be constructing sets of hypotheses about what certain parts of the signal are supposed to "mean", together with likelihood estimates for each hypothesis. And then you'd have to solve for the overall most likely set, or a small number of sets that are then inspected manually. Much like what is commonly done in natural language recognition. Anyway it sounds like a fair bit of work.
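
A byte-level toy version of that idea, far simpler than a real signal-level analysis: treat each value seen at an offset as a hypothesis, use its share of the reads as a likelihood estimate, and rank whole-sector hypotheses by the product of those estimates. Treating the bytes as independent is a simplifying assumption, and the name `ranked_hypotheses` is invented for this sketch.

```python
from collections import Counter
from itertools import product
from math import prod


def ranked_hypotheses(reads, top=3):
    """Return the `top` most likely sectors as (likelihood, bytes) pairs."""
    total = len(reads)
    length = len(reads[0])

    # For every offset: list of (value, estimated likelihood) hypotheses.
    per_byte = []
    for i in range(length):
        counts = Counter(r[i] for r in reads)
        per_byte.append([(value, count / total)
                         for value, count in counts.most_common()])

    ranked = []
    for combo in product(*per_byte):
        likelihood = prod(p for _, p in combo)   # independence assumption
        ranked.append((likelihood, bytes(v for v, _ in combo)))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked[:top]


# Four made-up reads of a 3-byte "sector".
reads = [b"\x10\x20\x30", b"\x10\x21\x30", b"\x10\x20\x31", b"\x10\x20\x30"]
ranked = ranked_hypotheses(reads)
for likelihood, sector in ranked:
    print(f"{likelihood:.4f} {sector.hex()}")
```

Enumerating every combination only works while few offsets are uncertain. A larger problem would need a smarter search, and the short list it produces is meant for manual inspection rather than as a final answer.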
