- Obfuscating first step of encryption - 1 Update
- Invert every 2nd byte in a container of raw data - 4 Updates
Frederick Gotham <cauldwell.thomas@gmail.com>: Mar 31 11:50AM -0700

I have triple-posted this to sci.crypt, comp.lang.c++, comp.lang.c. (I would have taken more flak for cross-posting.)

I'm currently writing a paper on a cryptographic technique I have come up with. I hope to publish before the end of April.

When encrypting/decrypting a file of size from about 500 kB to 500 GB, I aim for my technique to be comparably fast to the likes of Rijndael, Twofish, Serpent.

When encrypting/decrypting a small input, e.g. from 1 byte to 1 kB, my technique will be incredibly slow, as significant computation is required to load in the key and transform the first block. (The first block gets special treatment.) If you had a million small files to encrypt separately, it would take thousands of times longer than if it were one large file.

It is the time taken to load in the key and transform the first block that should make my technique far less susceptible to brute-forcing. I hear that the Blowfish algorithm was admired for this reason, because loading in the key is as expensive as encrypting 4 kB of data.

Has anyone any thoughts on deliberately obfuscating the first step of an encryption algorithm, with the consequence of severe inefficiency when encrypting small files, but with the aim of making brute-forcing extremely time-consuming?

I'm coding my technique in C++ for multi-threaded high performance on modern PCs (and supercomputers). I've got an x86-64 machine with 20 cores to test this on.

I'm also coding a single-threaded version of my technique in C for minimal ROM and RAM usage on microcontrollers. I won't be able to get it small enough for 8-bit microcontrollers, but a 32-bit PIC should be more than powerful enough.
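The trade-off described in the post above (a fixed, expensive key setup paid once per encryption, and therefore once per brute-force guess) can be put into numbers with a rough cost model. This is only a sketch: `CipherCost`, `encrypt_seconds` and `brute_force_seconds` are hypothetical names, and the setup time and throughput are made-up inputs, not measurements of the poster's technique or of any real cipher.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical cost model: every encryption pays a fixed setup cost
// (key load + first-block transform), then a per-byte cost for the rest.
struct CipherCost {
    double setup_seconds;     // assumed fixed cost per key load
    double bytes_per_second;  // assumed bulk throughput after setup
};

// Total time to encrypt `files` separate inputs of `bytes_per_file` each.
// The setup cost is paid once per file, not once overall.
double encrypt_seconds(const CipherCost& c, std::uint64_t files,
                       std::uint64_t bytes_per_file) {
    assert(c.bytes_per_second > 0.0);
    const double per_file =
        c.setup_seconds +
        static_cast<double>(bytes_per_file) / c.bytes_per_second;
    return static_cast<double>(files) * per_file;
}

// Expected brute-force time in this model: on average half of the
// candidate keys are tried, and every guess pays the setup cost.
double brute_force_seconds(const CipherCost& c, double candidate_keys) {
    return 0.5 * candidate_keys * c.setup_seconds;
}
```

With an assumed setup of 0.5 s and 1 GB/s bulk throughput, `encrypt_seconds(c, 1, 1000000000)` gives 1.5 s for one 1 GB file, while `encrypt_seconds(c, 1000000, 1000)` gives roughly 500,000 s for the same amount of data split into a million 1 kB files. In this model the attacker's cost scales with the same `setup_seconds`.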
Melzzzzz <Melzzzzz@zzzzz.com>: Mar 31 09:28AM

>> since the beginning of time...
> Using undocumented features in software which needs to be
> reliable is stupid.

An undocumented CPU feature is reliable. No CPU in production differs from another of the same model...

--
press any key to continue or any other to quit...
I enjoy nothing as much as my status as an INVALID -- Zli Zec
We are all witnesses - about 3 years of intensive propaganda is enough to drive a nation mad -- Zli Zec
In the Wild West there wasn't that much violence, precisely because everyone was armed. -- Mladen Gogala
Bonita Montero <Bonita.Montero@gmail.com>: Mar 31 11:31AM +0200

> 4 128 bit units per core. 2 adds and 2 multiplies. You can freely check
> at Agner's site...

That's wrong. Ryzens before Zen2 have 2 * 64 bit add and 2 * 64 bit mul. My benchmarks prove this. The asm mul-benchmark I've given in this thread is unrolled so that, if you were correct, you would get twice the throughput it actually gives. But it doesn't.
Bonita Montero <Bonita.Montero@gmail.com>: Mar 31 11:32AM +0200

> Undocumented CPU feature is reliable. ...

No, it's undocumented and therefore not reliable.
Bonita Montero <Bonita.Montero@gmail.com>: Mar 31 11:36AM +0200

> 4 128 bit units per core. 2 adds and 2 multiplies. You can freely check
> at Agner's site...

In this code ...

    ?fMul@@YQ_KXZ PROC
        vpxor   xmm0, xmm0, xmm0        ; zero ymm0 (a VEX-encoded xmm write clears the upper lanes)
        vpxor   xmm1, xmm1, xmm1        ; zero ymm1
        mov     rcx, 1000000000 / 10    ; loop count: 10 multiplies per iteration
    avxMulLoop:
        ; 10 VMULPD instructions per iteration, each reading and writing ymm0
        vmulpd  ymm0, ymm0, ymm1
        vmulpd  ymm0, ymm0, ymm1
        vmulpd  ymm0, ymm0, ymm1
        vmulpd  ymm0, ymm0, ymm1
        vmulpd  ymm0, ymm0, ymm1
        vmulpd  ymm0, ymm0, ymm1
        vmulpd  ymm0, ymm0, ymm1
        vmulpd  ymm0, ymm0, ymm1
        vmulpd  ymm0, ymm0, ymm1
        vmulpd  ymm0, ymm0, ymm1
        dec     rcx
        jnz     avxMulLoop
        mov     rax, 1000000000         ; return the number of multiplies executed
        ret
    ?fMul@@YQ_KXZ ENDP

... the CPU would alternately dispatch the VMULPD instructions to the alleged two 128 bit mul execution units. But the timings say otherwise.
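The post does not show how the routine above is timed. One plausible shape for the C++ side, given that the routine returns its operation count in `rax`, is a helper that times a single call and divides the returned count by the elapsed time. This is a sketch under assumptions: `ops_per_second` and `dummy_kernel` are hypothetical names, and `dummy_kernel` is only a stand-in so the example compiles without the assembly file.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: times one call of `kernel`, which must return the
// number of operations it executed, and reports operations per second.
template <typename Kernel>
double ops_per_second(Kernel&& kernel) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    const std::uint64_t ops = kernel();
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return static_cast<double>(ops) / elapsed.count();
}

// Stand-in kernel (an assumption, not the routine from the post):
// performs n scalar multiplies and returns n, mirroring the
// "return the operation count" convention of the assembly routine.
std::uint64_t dummy_kernel() {
    volatile double x = 1.0;  // volatile keeps the loop from being optimized away
    constexpr std::uint64_t n = 1000000;
    for (std::uint64_t i = 0; i < n; ++i) {
        x = x * 1.0000001;
    }
    return n;
}
```

Calling `ops_per_second(dummy_kernel)` yields a multiplies-per-second figure; to time the real routine, one would pass the externally linked assembly function in place of `dummy_kernel`.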