<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-toradex.git/net/ipv4/Makefile, branch v3.0.79</title>
<subtitle>Linux kernel for Apalis and Colibri modules</subtitle>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/'/>
<entry>
<title>net: ipv4: add IPPROTO_ICMP socket kind</title>
<updated>2011-05-13T20:08:13+00:00</updated>
<author>
<name>Vasiliy Kulikov</name>
<email>segoon@openwall.com</email>
</author>
<published>2011-05-13T10:01:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=c319b4d76b9e583a5d88d6bf190e079c4e43213d'/>
<id>c319b4d76b9e583a5d88d6bf190e079c4e43213d</id>
<content type='text'>
This patch adds IPPROTO_ICMP socket kind.  It makes it possible to send
ICMP_ECHO messages and receive the corresponding ICMP_ECHOREPLY messages
without any special privileges.  In other words, the patch makes it
possible to implement setuid-less and CAP_NET_RAW-less /bin/ping.  In
order not to increase the kernel's attack surface, the new functionality
is disabled by default, but is enabled at bootup by supporting Linux
distributions, optionally with restriction to a group or a group range
(see below).

Similar functionality is implemented in Mac OS X:
http://www.manpagez.com/man/4/icmp/

A new ping socket is created with

    socket(PF_INET, SOCK_DGRAM, PROT_ICMP)

Message identifiers (octets 4-5 of ICMP header) are interpreted as local
ports. Addresses are stored in struct sockaddr_in. No port numbers are
reserved for privileged processes, port 0 is reserved for API ("let the
kernel pick a free number"). There is no notion of remote ports, remote
port numbers provided by the user (e.g. in connect()) are ignored.

Data sent and received include ICMP headers. This is deliberate to:
1) Avoid the need to transport headers values like sequence numbers by
other means.
2) Make it easier to port existing programs using raw sockets.

ICMP headers given to send() are checked and sanitized. The type must be
ICMP_ECHO and the code must be zero (future extensions might relax this,
see below). The id is set to the number (local port) of the socket, the
checksum is always recomputed.

ICMP reply packets received from the network are demultiplexed according
to their id's, and are returned by recv() without any modifications.
IP header information and ICMP errors of those packets may be obtained
via ancillary data (IP_RECVTTL, IP_RETOPTS, and IP_RECVERR). ICMP source
quenches and redirects are reported as fake errors via the error queue
(IP_RECVERR); the next hop address for redirects is saved to ee_info (in
network order).

socket(2) is restricted to the group range specified in
"/proc/sys/net/ipv4/ping_group_range".  It is "1 0" by default, meaning
that nobody (not even root) may create ping sockets.  Setting it to "100
100" would grant permissions to the single group (to either make
/sbin/ping g+s and owned by this group or to grant permissions to the
"netadmins" group), "0 4294967295" would enable it for the world, "100
4294967295" would enable it for the users, but not daemons.

The existing code might be (in the unlikely case anyone needs it)
extended rather easily to handle other similar pairs of ICMP messages
(Timestamp/Reply, Information Request/Reply, Address Mask Request/Reply
etc.).

Userspace ping util &amp; patch for it:
http://openwall.info/wiki/people/segoon/ping

For Openwall GNU/*/Linux it was the last step on the road to the
setuid-less distro.  A revision of this patch (for RHEL5/OpenVZ kernels)
is in use in Owl-current, such as in the 2011/03/12 LiveCD ISOs:
http://mirrors.kernel.org/openwall/Owl/current/iso/

Initially this functionality was written by Pavel Kankovsky for
Linux 2.4.32, but unfortunately it was never made public.

All ping options (-b, -p, -Q, -R, -s, -t, -T, -M, -I), are tested with
the patch.

PATCH v3:
    - switched to flowi4.
    - minor changes to be consistent with raw sockets code.

PATCH v2:
    - changed ping_debug() to pr_debug().
    - removed CONFIG_IP_PING.
    - removed ping_seq_fops.owner field (unused for procfs).
    - switched to proc_net_fops_create().
    - switched to %pK in seq_printf().

PATCH v1:
    - fixed checksumming bug.
    - CAP_NET_RAW may not create icmp sockets anymore.

RFC v2:
    - minor cleanups.
    - introduced sysctl'able group range to restrict socket(2).

Signed-off-by: Vasiliy Kulikov &lt;segoon@openwall.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds IPPROTO_ICMP socket kind.  It makes it possible to send
ICMP_ECHO messages and receive the corresponding ICMP_ECHOREPLY messages
without any special privileges.  In other words, the patch makes it
possible to implement setuid-less and CAP_NET_RAW-less /bin/ping.  In
order not to increase the kernel's attack surface, the new functionality
is disabled by default, but is enabled at bootup by supporting Linux
distributions, optionally with restriction to a group or a group range
(see below).

Similar functionality is implemented in Mac OS X:
http://www.manpagez.com/man/4/icmp/

A new ping socket is created with

    socket(PF_INET, SOCK_DGRAM, PROT_ICMP)

Message identifiers (octets 4-5 of ICMP header) are interpreted as local
ports. Addresses are stored in struct sockaddr_in. No port numbers are
reserved for privileged processes, port 0 is reserved for API ("let the
kernel pick a free number"). There is no notion of remote ports, remote
port numbers provided by the user (e.g. in connect()) are ignored.

Data sent and received include ICMP headers. This is deliberate to:
1) Avoid the need to transport headers values like sequence numbers by
other means.
2) Make it easier to port existing programs using raw sockets.

ICMP headers given to send() are checked and sanitized. The type must be
ICMP_ECHO and the code must be zero (future extensions might relax this,
see below). The id is set to the number (local port) of the socket, the
checksum is always recomputed.

ICMP reply packets received from the network are demultiplexed according
to their id's, and are returned by recv() without any modifications.
IP header information and ICMP errors of those packets may be obtained
via ancillary data (IP_RECVTTL, IP_RETOPTS, and IP_RECVERR). ICMP source
quenches and redirects are reported as fake errors via the error queue
(IP_RECVERR); the next hop address for redirects is saved to ee_info (in
network order).

socket(2) is restricted to the group range specified in
"/proc/sys/net/ipv4/ping_group_range".  It is "1 0" by default, meaning
that nobody (not even root) may create ping sockets.  Setting it to "100
100" would grant permissions to the single group (to either make
/sbin/ping g+s and owned by this group or to grant permissions to the
"netadmins" group), "0 4294967295" would enable it for the world, "100
4294967295" would enable it for the users, but not daemons.

The existing code might be (in the unlikely case anyone needs it)
extended rather easily to handle other similar pairs of ICMP messages
(Timestamp/Reply, Information Request/Reply, Address Mask Request/Reply
etc.).

Userspace ping util &amp; patch for it:
http://openwall.info/wiki/people/segoon/ping

For Openwall GNU/*/Linux it was the last step on the road to the
setuid-less distro.  A revision of this patch (for RHEL5/OpenVZ kernels)
is in use in Owl-current, such as in the 2011/03/12 LiveCD ISOs:
http://mirrors.kernel.org/openwall/Owl/current/iso/

Initially this functionality was written by Pavel Kankovsky for
Linux 2.4.32, but unfortunately it was never made public.

All ping options (-b, -p, -Q, -R, -s, -t, -T, -M, -I), are tested with
the patch.

PATCH v3:
    - switched to flowi4.
    - minor changes to be consistent with raw sockets code.

PATCH v2:
    - changed ping_debug() to pr_debug().
    - removed CONFIG_IP_PING.
    - removed ping_seq_fops.owner field (unused for procfs).
    - switched to proc_net_fops_create().
    - switched to %pK in seq_printf().

PATCH v1:
    - fixed checksumming bug.
    - CAP_NET_RAW may not create icmp sockets anymore.

RFC v2:
    - minor cleanups.
    - introduced sysctl'able group range to restrict socket(2).

Signed-off-by: Vasiliy Kulikov &lt;segoon@openwall.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: Remove fib_hash.</title>
<updated>2011-02-01T23:35:25+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2011-02-01T23:15:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=3630b7c050d9c3564f143d595339fc06b888d6f3'/>
<id>3630b7c050d9c3564f143d595339fc06b888d6f3</id>
<content type='text'>
The time has finally come to remove the hash based routing table
implementation in ipv4.

FIB Trie is mature, well tested, and I've done an audit of it's code
to confirm that it implements insert, delete, and lookup with the same
identical semantics as fib_hash did.

If there are any semantic differences found in fib_trie, we should
simply fix them.

I've placed the trie statistic config option under advanced router
configuration.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Acked-by: Stephen Hemminger &lt;shemminger@vyatta.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The time has finally come to remove the hash based routing table
implementation in ipv4.

FIB Trie is mature, well tested, and I've done an audit of it's code
to confirm that it implements insert, delete, and lookup with the same
identical semantics as fib_hash did.

If there are any semantic differences found in fib_trie, we should
simply fix them.

I've placed the trie statistic config option under advanced router
configuration.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Acked-by: Stephen Hemminger &lt;shemminger@vyatta.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol)</title>
<updated>2010-08-22T06:05:39+00:00</updated>
<author>
<name>Dmitry Kozlov</name>
<email>xeb@mail.ru</email>
</author>
<published>2010-08-22T06:05:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=00959ade36acadc00e757f87060bf6e4501d545f'/>
<id>00959ade36acadc00e757f87060bf6e4501d545f</id>
<content type='text'>
PPP: introduce "pptp" module which implements point-to-point tunneling protocol using pppox framework
NET: introduce the "gre" module for demultiplexing GRE packets on version criteria
     (required to pptp and ip_gre may coexists)
NET: ip_gre: update to use the "gre" module

This patch introduces then pptp support to the linux kernel which
dramatically speeds up pptp vpn connections and decreases cpu usage in
comparison of existing user-space implementation
(poptop/pptpclient). There is accel-pptp project
(https://sourceforge.net/projects/accel-pptp/) to utilize this module,
it contains plugin for pppd to use pptp in client-mode and modified
pptpd (poptop) to build high-performance pptp NAS.

There was many changes from initial submitted patch, most important are:
1. using rcu instead of read-write locks
2. using static bitmap instead of dynamically allocated
3. using vmalloc for memory allocation instead of BITS_PER_LONG + __get_free_pages
4. fixed many coding style issues
Thanks to Eric Dumazet.

Signed-off-by: Dmitry Kozlov &lt;xeb@mail.ru&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
PPP: introduce "pptp" module which implements point-to-point tunneling protocol using pppox framework
NET: introduce the "gre" module for demultiplexing GRE packets on version criteria
     (required to pptp and ip_gre may coexists)
NET: ip_gre: update to use the "gre" module

This patch introduces then pptp support to the linux kernel which
dramatically speeds up pptp vpn connections and decreases cpu usage in
comparison of existing user-space implementation
(poptop/pptpclient). There is accel-pptp project
(https://sourceforge.net/projects/accel-pptp/) to utilize this module,
it contains plugin for pppd to use pptp in client-mode and modified
pptpd (poptop) to build high-performance pptp NAS.

There was many changes from initial submitted patch, most important are:
1. using rcu instead of read-write locks
2. using static bitmap instead of dynamically allocated
3. using vmalloc for memory allocation instead of BITS_PER_LONG + __get_free_pages
4. fixed many coding style issues
Thanks to Eric Dumazet.

Signed-off-by: Dmitry Kozlov &lt;xeb@mail.ru&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>IPVS: Move IPVS to net/netfilter/ipvs</title>
<updated>2008-10-06T21:38:24+00:00</updated>
<author>
<name>Julius Volz</name>
<email>juliusv@google.com</email>
</author>
<published>2008-09-19T10:32:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=cb7f6a7b716e801097b564dec3ccb58d330aef56'/>
<id>cb7f6a7b716e801097b564dec3ccb58d330aef56</id>
<content type='text'>
Since IPVS now has partial IPv6 support, this patch moves IPVS from
net/ipv4/ipvs to net/netfilter/ipvs. It's a result of:

$ git mv net/ipv4/ipvs net/netfilter

and adapting the relevant Kconfigs/Makefiles to the new path.

Signed-off-by: Julius Volz &lt;juliusv@google.com&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since IPVS now has partial IPv6 support, this patch moves IPVS from
net/ipv4/ipvs to net/netfilter/ipvs. It's a result of:

$ git mv net/ipv4/ipvs net/netfilter

and adapting the relevant Kconfigs/Makefiles to the new path.

Signed-off-by: Julius Volz &lt;juliusv@google.com&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[IPV4]: Cleanup the sysctl_net_ipv4.c file</title>
<updated>2008-01-28T22:56:27+00:00</updated>
<author>
<name>Pavel Emelyanov</name>
<email>xemul@openvz.org</email>
</author>
<published>2007-12-05T09:38:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=9ba639797606acbcd97be886f41fbce163914e7b'/>
<id>9ba639797606acbcd97be886f41fbce163914e7b</id>
<content type='text'>
This includes several cleanups:

 * tune Makefile to compile out this file when SYSCTL=n. Now
   it looks like net/core/sysctl_net_core.c one;
 * move the ipv4_config to af_inet.c to exist all the time;
 * remove additional sysctl_ip_nonlocal_bind declaration
   (it is already declared in net/ip.h);
 * remove no nonger needed ifdefs from this file.

This is a preparation for using ctl paths for net/ipv4/
sysctl table.

Signed-off-by: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This includes several cleanups:

 * tune Makefile to compile out this file when SYSCTL=n. Now
   it looks like net/core/sysctl_net_core.c one;
 * move the ipv4_config to af_inet.c to exist all the time;
 * remove additional sysctl_ip_nonlocal_bind declaration
   (it is already declared in net/ip.h);
 * remove no nonger needed ifdefs from this file.

This is a preparation for using ctl paths for net/ipv4/
sysctl table.

Signed-off-by: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[INET]: Collect frag queues management objects together</title>
<updated>2007-10-15T19:26:39+00:00</updated>
<author>
<name>Pavel Emelyanov</name>
<email>xemul@openvz.org</email>
</author>
<published>2007-10-15T09:31:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=7eb95156d9dce2f59794264db336ce007d71638b'/>
<id>7eb95156d9dce2f59794264db336ce007d71638b</id>
<content type='text'>
There are some objects that are common in all the places
which are used to keep track of frag queues, they are:

 * hash table
 * LRU list
 * rw lock
 * rnd number for hash function
 * the number of queues
 * the amount of memory occupied by queues
 * secret timer

Move all this stuff into one structure (struct inet_frags)
to make it possible use them uniformly in the future. Like
with the previous patch this mostly consists of hunks like

-    write_lock(&amp;ipfrag_lock);
+    write_lock(&amp;ip4_frags.lock);

To address the issue with exporting the number of queues and
the amount of memory occupied by queues outside the .c file
they are declared in, I introduce a couple of helpers.

Signed-off-by: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are some objects that are common in all the places
which are used to keep track of frag queues, they are:

 * hash table
 * LRU list
 * rw lock
 * rnd number for hash function
 * the number of queues
 * the amount of memory occupied by queues
 * secret timer

Move all this stuff into one structure (struct inet_frags)
to make it possible use them uniformly in the future. Like
with the previous patch this mostly consists of hunks like

-    write_lock(&amp;ipfrag_lock);
+    write_lock(&amp;ip4_frags.lock);

To address the issue with exporting the number of queues and
the amount of memory occupied by queues outside the .c file
they are declared in, I introduce a couple of helpers.

Signed-off-by: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[NET]: Generic Large Receive Offload for TCP traffic</title>
<updated>2007-10-10T23:47:46+00:00</updated>
<author>
<name>Jan-Bernd Themann</name>
<email>themann@de.ibm.com</email>
</author>
<published>2007-08-09T05:38:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=71c87e0cedca843162206c698cfa02e5fea9e2e3'/>
<id>71c87e0cedca843162206c698cfa02e5fea9e2e3</id>
<content type='text'>
This patch provides generic Large Receive Offload (LRO) functionality
for IPv4/TCP traffic.

LRO combines received tcp packets to a single larger tcp packet and
passes them then to the network stack in order to increase performance
(throughput). The interface supports two modes: Drivers can either
pass SKBs or fragment lists to the LRO engine.

Signed-off-by: Jan-Bernd Themann &lt;themann@de.ibm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch provides generic Large Receive Offload (LRO) functionality
for IPv4/TCP traffic.

LRO combines received tcp packets to a single larger tcp packet and
passes them then to the network stack in order to increase performance
(throughput). The interface supports two modes: Drivers can either
pass SKBs or fragment lists to the LRO engine.

Signed-off-by: Jan-Bernd Themann &lt;themann@de.ibm.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[IPV4]: The scheduled removal of multipath cached routing support.</title>
<updated>2007-07-11T05:05:57+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@sunset.davemloft.net</email>
</author>
<published>2007-06-11T00:22:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=e06e7c615877026544ad7f8b309d1a3706410383'/>
<id>e06e7c615877026544ad7f8b309d1a3706410383</id>
<content type='text'>
With help from Chris Wedgwood.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With help from Chris Wedgwood.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[IPV4]: Consolidate common SNMP code</title>
<updated>2007-04-26T05:29:51+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2007-04-25T04:53:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=5e0f04351d11e07a23b5ab4914282cbb78027e50'/>
<id>5e0f04351d11e07a23b5ab4914282cbb78027e50</id>
<content type='text'>
This patch moves the SNMP code shared between IPv4/IPv6 from proc.c
into net/ipv4/af_inet.c.  This makes sense because these functions
aren't specific to /proc.

As a result we can again skip proc.o if /proc is disabled.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Acked-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch moves the SNMP code shared between IPv4/IPv6 from proc.c
into net/ipv4/af_inet.c.  This makes sense because these functions
aren't specific to /proc.

As a result we can again skip proc.o if /proc is disabled.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Acked-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[IPV4]: Fix build without procfs.</title>
<updated>2007-04-26T05:29:50+00:00</updated>
<author>
<name>YOSHIFUJI Hideaki</name>
<email>yoshfuji@linux-ipv6.org</email>
</author>
<published>2007-04-24T23:22:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.toradex.cn/cgit/linux-toradex.git/commit/?id=bb7ec6dfb5aa32b5b4d7d6388b4098b33cd01e8c'/>
<id>bb7ec6dfb5aa32b5b4d7d6388b4098b33cd01e8c</id>
<content type='text'>
Signed-off-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: YOSHIFUJI Hideaki &lt;yoshfuji@linux-ipv6.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
