Generalized Bloom Filter
As the standard filter, the Generalized Bloom Filter is also a data structure used to represent a set S = {s1,s2,...,sn} of n elements in a compact form. It is constituted by an array of m bits and by k0 + k1 independent hash functions g1,g2,...,gk0,h1,h2,...,hk1 whose outputs are uniformly distributed over the discrete range {0,1,...,m - 1}. The GBF is built in a similar way to the standard filter. Nevertheless, the initial value of the bits of the array is not restricted to zero anymore. In the GBF, these bits can be initialized to any value. For each element si ∈ S, the bits corresponding to the positions g1(si),g2(si),...,gk0(si) are reset and the bits corresponding to the positions h1(si),h2(si),...,hk1(si) are set. In the case of a collision between a function gi and a function hj for the same element, we arbitrate that the resulting bit in the filter is always reset. The same bit can be set or reset several times without restrictions. Fig. 1 depicts how an element is inserted into a Generalized Bloom Filter (GBF). After inserting the elements, membership queries can be easily conducted. To check if an element x belongs to S, we check if the bits of the array corresponding to the positions g1(x),g2(x),...,gk0(x) are all reset and if the bits h1(x),h2(x),...,hk1(x) are all set. If at least one bit is inverted, then x ∉ S with high probability. In the GBF, it is possible that an element x ∈ S may not be recognized as an element of the set, creating a false negative. Such anomaly happens when at least one of the bits g1(x),g2(x),...,gk0(x) is set or one of the bits h1(x),h2(x),...,hk1(x) is reset by another element inserted afterwards. On the other hand, if no bit is inverted, then x ∈ S also with high probability. In fact, an element x ∉ S may be recognized as an element of the set, creating a false positive. A false positive occurs when the bits g1(x),g2(x),...,gk0(x) are all reset and the bits h1(x),h2(x),...,hk1(x) are all set due to other inserted elements or due to the initial condition of the bit array.
Fig. 1 - Insertion of an element into a Generalized Bloom Filter.The main advantage of the Generalized Bloom Filter is that both the false-positive and the false-negative ratios are upper bounded. In addition, these upper bounds depend only on the chosen parameters of the filter and not on its initial condition. As a result, it can be used in distributed applications that require a secure and space-efficient set representation. See [1] and [2] for details.
Publications
- Rafael P. Laufer , Pedro B. Velloso, Daniel de O. Cunha, Igor M. Moraes, Marco D. D. Bicudo, Marcelo D. D. Moreira, and Otto Carlos M. B. Duarte, "Towards Stateless Single-Packet IP Traceback," 32nd IEEE Conference on Local Computer Networks - LCN'2007, Dublin, Ireland, October 2007. [PDF]
- Rafael P. Laufer, Pedro B. Velloso, Otto Carlos M. B. Duarte, "Generalized Bloom Filters," Technical Report GTA-05-43, COPPE/UFRJ, September 2005. [PDF]
- Rafael P. Laufer, Pedro B. Velloso, Daniel de O. Cunha, Igor M. Moraes, Marco D. D. Bicudo, Marcelo D. D. Moreira, Otto Carlos M. B. Duarte, "Towards Stateless Single-Packet IP Traceback," Technical Report GTA-06-38, COPPE/UFRJ, December 2006. [PDF]
- Rafael P. Laufer, Pedro B. Velloso, Daniel de O. Cunha, Igor M. Moraes, Marco D. D. Bicudo, and Otto Carlos M. B. Duarte, "A New IP Traceback System against Denial-of-Service Attacks", 12th International Conference on Telecommunications - ICT'2005, May, 2005. Capetown, South Africa. [PDF]
In Portuguese
- Marcelo D. D. Moreira, Gustavo L. Coutinho, Igor M. Moraes, Rafael P. Laufer, and Otto Carlos M. B. Duarte, "RAT: Implementação de um Serviço de Rastreamento de Pacotes," XI Workshop de Gerência e Operação de Redes e Serviços - WGRS'2006, Curitiba, PR, Brazil, May 2006. [PDF]
- Rafael P. Laufer, Igor M. Moraes, Pedro B. Velloso, Marco D. D. Bicudo, Miguel Elias M. Campista, Daniel de O. Cunha, Luís Henrique M. K. Costa, and Otto Carlos M. B. Duarte, "Negação de Serviço: Ataques e Contramedidas", Minicursos do Simpósio Brasileiro de Segurança da Informação e de Sistemas Computacionais - SBSeg'2005, September 2005. Florianópolis, Brazil. [PDF]
- Rafael P. Laufer, Pedro B. Velloso, Daniel de O. Cunha, and Otto Carlos M. B. Duarte, "Um Procedimento Alternativo de Reconstrução de Rota para o Rastreamento de Pacotes IP", XXII Simpósio Brasileiro de Telecomunicações - SBrT'05, September 2005. Campinas, Brazil. [PDF]
- Rafael P. Laufer, Pedro B. Velloso, and Otto Carlos M. B. Duarte, "Um Novo Sistema de Rastreamento de Pacotes IP contra Ataques de Negação de Serviço", XXIII Simpósio Brasileiro de Redes de Computadores - SBRC'2005, May 2005. Fortaleza, Brazil. [PDF]
- Rafael P. Laufer, "Rastreamento de Pacotes IP contra Ataques de Negação de Serviço," M.Sc. Thesis, Universidade Federal do Rio de Janeiro - UFRJ, September 2005. [PDF]
Source Code
File Size Date Version MD5sum gbf-1.0.tar.gz 4KB 09/2005 1.0 2845d73cff3c2983aa3a7013142ffc71 This program implements the Generalized Bloom Filter using a universal class of hash functions. The class used is defined as follows. Let the elements be integers from the universe S={1, 2,..., z - 1} and let the output range of the hash functions be {0, 1,..., m - 1}. The class H of hash transformations hc,d which map an element x ∈ S into the respective range is
H = {hc,d(.) | 0 < c < z, 0 ≤ z}, (1)
where
hc,d(x) = [(cx + d) mod z] mod m. (2)
The numbers c and d are integers within the interval defined by (1). For each hash function, c and d are arbitrarily chosen. The integer z is defined as a large prime.gbf_traceback-1.0.tar.gz 77KB 09/2005 1.0 85d6342500c52aea718fe24b1a18aea8 This program simulates the proposed IP traceback scheme in an Internet-based topology. For each attack, one packet is sent towards the victim. The packet contains a GBF which stores the digests of the IP addresses of the traversed router. Once the packet reaches the victim, the reconstruction procedure starts. Two reconstruction procedures are implemented. In first one, each router tests its neighbors against the filter. When the router detects that a neighbor router is also in the filter, it forwards the GBF to that router so that the reconstruction procedure can proceed towards the attacker. The second procedure improves the false-negative ratio of the first procedure by using information from the previous routers in the reconstructed path. See [2] for details.