summaryrefslogtreecommitdiff
path: root/net/core/skbuff.c
AgeCommit message (Collapse)Author
2009-04-30net: Fix oops when splicing skbs from a frag_list.Jarek Poplawski
Lennert Buytenhek wrote: > Since 4fb669948116d928ae44262ab7743732c574630d ("net: Optimize memory > usage when splicing from sockets.") I'm seeing this oops (e.g. in > 2.6.30-rc3) when splicing from a TCP socket to /dev/null on a driver > (mv643xx_eth) that uses LRO in the skb mode (lro_receive_skb) rather > than the frag mode: My patch incorrectly assumed skb->sk was always valid, but for "frag_listed" skbs we can only use skb->sk of their parent. Reported-by: Lennert Buytenhek <buytenh@wantstofly.org> Debugged-by: Lennert Buytenhek <buytenh@wantstofly.org> Tested-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-28gso: Fix support for linear packetsHerbert Xu
When GRO/frag_list support was added to GSO, I made an error which broke the support for segmenting linear GSO packets (GSO packets are normally non-linear in the payload). These days most of these packets are constructed by the tun driver, which prefers to allocate linear memory if possible. This is fixed in the latest kernel, but for 2.6.29 and earlier it is still the norm. Therefore this bug causes failures with GSO when used with tun in 2.6.29. Reported-by: James Huang <jamesclhuang@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13Network Drop Monitor: Adding kfree_skb_clean for non-drops and modifying ↵Neil Horman
end-of-line points for skbs Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/linux/skbuff.h | 4 +++- net/core/datagram.c | 2 +- net/core/skbuff.c | 22 ++++++++++++++++++++++ net/ipv4/arp.c | 2 +- net/ipv4/udp.c | 2 +- net/packet/af_packet.c | 2 +- 6 files changed, 29 insertions(+), 5 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-26core: remove some pointless conditionals before kfree_skb()Wei Yongjun
Remove some pointless conditionals before kfree_skb(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-24Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller
2009-02-17net: Kill skb_truesize_check(), it only catches false-positives.David S. Miller
A long time ago we had bugs, primarily in TCP, where we would modify skb->truesize (for TSO queue collapsing) in ways which would corrupt the socket memory accounting. skb_truesize_check() was added in order to try and catch this error more systematically. However this debugging check has morphed into a Frankenstein of sorts and these days it does nothing other than catch false-positives. Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-15net: infrastructure for hardware time stampingPatrick Ohly
The additional per-packet information (16 bytes for time stamps, 1 byte for flags) is stored for all packets in the skb_shared_info struct. This implementation detail is hidden from users of that information via skb_* accessor functions. A separate struct resp. union is used for the additional information so that it can be stored/copied easily outside of skb_shared_info. Compared to previous implementations (reusing the tstamp field depending on the context, optional additional structures) this is the simplest solution. It does not extend sk_buff itself. TX time stamping is implemented in software if the device driver doesn't support hardware time stamping. The new semantic for hardware/software time stamping around ndo_start_xmit() is based on two assumptions about existing network device drivers which don't support hardware time stamping and know nothing about it: - they leave the new skb_shared_tx unmodified - the keep the connection to the originating socket in skb->sk alive, i.e., don't call skb_orphan() Given that skb_shared_tx is new, the first assumption is safe. The second is only true for some drivers. As a result, software TX time stamping currently works with the bnx2 driver, but not with the unmodified igb driver (the two drivers this patch series was tested with). Signed-off-by: Patrick Ohly <patrick.ohly@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-12net: Fix page seeking for skb_splice_bits().Jarek Poplawski
struct page walking should be done with proper accessor functions, not directly. With doubts from David S. Miller and Herbert Xu. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-10net: Move skbuff symbol exports after each symbol's definition.David S. Miller
net/core/skbuff.c is a hodge-podge of symbol export placement. Some of the exports are right after the definition of the symbol being exported, others are clumped together into a big group at the end of the file. Make things consistent. Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-05gro: Fix frag_list merging on imprecisely split packetsHerbert Xu
The previous fix ad0f9904444de1309dedd2b9e365cae8af77d9b1 (gro: Fix handling of imprecisely split packets) only fixed the case of frags merging, frag_list merging in the same circumstances were still broken. In particular, the packet headers end up in the data stream. This patch fixes this plus another issue where an imprecisely split packet header may be read incorrectly (this is mostly harmless since it'll simply cause the packet to not match and be rejected for GRO). Thanks to Emil Tantilov and Jeff Kirsher for helping to track this down. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-01net: Optimize memory usage when splicing from sockets.Jarek Poplawski
The recent fix of data corruption when splicing from sockets uses memory very inefficiently allocating a new page to copy each chunk of linear part of skb. This patch uses the same page until it's full (almost) by caching the page in sk_sndmsg_page field. With changes from David S. Miller <davem@davemloft.net> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-30Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/e1000/e1000_main.c
2009-01-29gro: Do not merge paged packets into frag_listHerbert Xu
gro: Do not merge paged packets into frag_list Bigger is not always better :) It was easy to continue to merged packets into frag_list after the page array is full. However, this turns out to be worse than LRO because frag_list is a much less efficient form of storage than the page array. So we're better off stopping the merge and starting a new entry with an empty page array. In future we can optimise this further by doing frag_list merging but making sure that we continue to fill in the page array. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-29gro: Avoid copying headers of unmerged packetsHerbert Xu
Unfortunately simplicity isn't always the best. The fraginfo interface turned out to be suboptimal. The problem was quite obvious. For every packet, we have to copy the headers from the frags structure into skb->head, even though for 99% of the packets this part is immediately thrown away after the merge. LRO didn't have this problem because it directly read the headers from the frags structure. This patch attempts to address this by creating an interface that allows GRO to access the headers in the first frag without having to copy it. Because all drivers that use frags place the headers in the first frag this optimisation should be enough. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-29net: Fix OOPS in skb_seq_read().Shyam Iyer
It oopsd for me in skb_seq_read. addr2line said it was linux-2.6/net/core/skbuff.c:2228, which is this line: while (st->frag_idx < skb_shinfo(st->cur_skb)->nr_frags) { I added some printks in there and it looks like we hit this: } else if (st->root_skb == st->cur_skb && skb_shinfo(st->root_skb)->frag_list) { st->cur_skb = skb_shinfo(st->root_skb)->frag_list; st->frag_idx = 0; goto next_skb; } Actually I did some testing and added a few printks and found that the st->cur_skb->data was 0 and hence the ptr used by iscsi_tcp was null. This caused the kernel panic. if (abs_offset < block_limit) { - *data = st->cur_skb->data + abs_offset; + *data = st->cur_skb->data + (abs_offset - st->stepped_offset); I enabled the debug_tcp and with a few printks found that the code did not go to the next_skb label and could find that the sequence being followed was this - It hit this if condition - if (st->cur_skb->next) { st->cur_skb = st->cur_skb->next; st->frag_idx = 0; goto next_skb; And so, now the st pointer is shifted to the next skb whereas actually it should have hit the second else if first since the data is in the frag_list. else if (st->root_skb == st->cur_skb && skb_shinfo(st->root_skb)->frag_list) { st->cur_skb = skb_shinfo(st->root_skb)->frag_list; goto next_skb; } Reversing the two conditions the attached patch fixes the issue for me on top of Herbert's patches. Signed-off-by: Shyam Iyer <shyam_iyer@dell.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-29net: Fix frag_list handling in skb_seq_readHerbert Xu
The frag_list handling was broken in skb_seq_read: 1) We didn't add the stepped offset when looking at the head are of fragments other than the first. 2) We didn't take the stepped offset away when setting the data pointer in the head area. 3) The frag index wasn't reset. This patch fixes both issues. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-20gro: Fix merging of paged packetsHerbert Xu
The previous fix to paged packets broke the merging because it reset the skb->len before we added it to the merged packet. This wasn't detected because it simply resulted in the truncation of the packet while the missing bit is subsequently retransmitted. The fix is to store skb->len before we clobber it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-19net: Fix data corruption when splicing from sockets.Jarek Poplawski
The trick in socket splicing where we try to convert the skb->data into a page based reference using virt_to_page() does not work so well. The idea is to pass the virt_to_page() reference via the pipe buffer, and refcount the buffer using a SKB reference. But if we are splicing from a socket to a socket (via sendpage) this doesn't work. The from side processing will grab the page (and SKB) references. The sendpage() calls will grab page references only, return, and then the from side processing completes and drops the SKB ref. The page based reference to skb->data is not enough to keep the kmalloc() buffer backing it from being reused. Yet, that is all that the socket send side has at this point. This leads to data corruption if the skb->data buffer is reused by SLAB before the send side socket actually gets the TX packet out to the device. The fix employed here is to simply allocate a page and copy the skb->data bytes into that page. This will hurt performance, but there is no clear way to fix this properly without a copy at the present time, and it is important to get rid of the data corruption. With fixes from Herbert Xu. Tested-by: Willy Tarreau <w@1wt.eu> Foreseen-by: Changli Gao <xiaosuo@gmail.com> Diagnosed-by: Willy Tarreau <w@1wt.eu> Reported-by: Willy Tarreau <w@1wt.eu> Fixed-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-14gro: Fix page ref count for skbs freed normallyHerbert Xu
When an skb with page frags is merged into an existing one, we cannibalise its reference count. This is OK when the skb is reused because we set nr_frags to zero in that case. However, for the case where the skb is freed through kfree_skb, we didn't clear nr_frags which causes the page to be freed prematurely. This is fixed by moving the skb resetting into skb_gro_receive. Reported-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-04gro: Add page frag supportHerbert Xu
This patch allows GRO to merge page frags (skb_shinfo(skb)->frags) in one skb, rather than using the less efficient frag_list. It also adds a new interface, napi_gro_frags to allow drivers to inject page frags directly into the stack without allocating an skb. This is intended to be the GRO equivalent for LRO's lro_receive_frags interface. The existing GSO interface can already handle page frags with or without an appended frag_list so nothing needs to be changed there. The merging itself is rather simple. We store any new frag entries after the last existing entry, without checking whether the first new entry can be merged with the last existing entry. Making this check would actually be easy but since no existing driver can produce contiguous frags anyway it would just be mental masturbation. If the total number of entries would exceed the capacity of a single skb, we simply resort to using frag_list as we do now. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-04gro: Use gso_size to store MSSHerbert Xu
In order to allow GRO packets without frag_list at all, we need to store the MSS in the packet itself. The obvious place is gso_size. The only thing to watch out for is if the packet ends up not being GRO then we need to clear gso_size before pushing the packet into the stack. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-15net: Add skb_gro_receiveHerbert Xu
This patch adds the helper skb_gro_receive to merge packets for GRO. The current method is to allocate a new header skb and then chain the original packets to its frag_list. This is done to make it easier to integrate into the existing GSO framework. In future as GSO is moved into the drivers, we can undo this and simply chain the original packets together. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-15net: Add frag_list support to skb_segmentHerbert Xu
This patch adds limited support for handling frag_list packets in skb_segment. The intention is to support GRO (Generic Receive Offload) packets which will be constructed by chaining normal packets using frag_list. As such we require all frag_list members terminate on exact MSS boundaries. This is checked using BUG_ON. As there should only be one producer in the kernel of such packets, namely GRO, this requirement should not be difficult to maintain. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-26Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/hp-plus.c drivers/net/wireless/ath5k/base.c drivers/net/wireless/ath9k/recv.c net/wireless/reg.c
2008-11-25net: make skb_truesize_bug() call WARN()Arjan van de Ven
The truesize message check is important enough to make it print "BUG" to the user console... lets also make it important enough to spit a backtrace/module list etc so that kerneloops.org can track them. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25tcp: skb_shift cannot cache frag ptrs past pskb_expand_headIlpo Järvinen
Since pskb_expand_head creates copy of the shared area we cannot keep any frag ptr past de-cloning. This fixes the tcpdump recvfrom -EFAULT problem. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24tcp: handle shift/merge of cloned skbs tooIlpo Järvinen
This caused me to get repeatably: tcpdump: pcap_loop: recvfrom: Bad address Happens occassionally when I tcpdump my for-looped test xfers: while [ : ]; do echo -n "$(date '+%s.%N') "; ./sendfile; sleep 20; done Rest of the relevant commands: ethtool -K eth0 tso off tc qdisc add dev eth0 root netem drop 4% tcpdump -n -s0 -i eth0 -w sacklog.all Running net-next under kvm, connection goes to the same host (basically just out of kvm). The connection itself works ok and data gets sent without corruption even with a large number of tests while tcpdump fails usually within less than 5 tests. Whether it only happens because of this change or not, I don't know for sure but it's the only thing with which I've seen that error. The non-cloned variant works w/o it for much longer time. I'm yet to debug where the error actually comes from. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24tcp: Try to restore large SKBs while SACK processingIlpo Järvinen
During SACK processing, most of the benefits of TSO are eaten by the SACK blocks that one-by-one fragment SKBs to MSS sized chunks. Then we're in problems when cleanup work for them has to be done when a large cumulative ACK comes. Try to return back to pre-split state already while more and more SACK info gets discovered by combining newly discovered SACK areas with the previous skb if that's SACKed as well. This approach has a number of benefits: 1) The processing overhead is spread more equally over the RTT 2) Write queue has less skbs to process (affect everything which has to walk in the queue past the sacked areas) 3) Write queue is consistent whole the time, so no other parts of TCP has to be aware of this (this was not the case with some other approach that was, well, quite intrusive all around). 4) Clean_rtx_queue can release most of the pages using single put_page instead of previous PAGE_SIZE/mss+1 calls In case a hole is fully filled by the new SACK block, we attempt to combine the next skb too which allows construction of skbs that are even larger than what tso split them to and it handles hole per on every nth patterns that often occur during slow start overshoot pretty nicely. Though this to be really useful also a retransmission would have to get lost since cumulative ACKs advance one hole at a time in the most typical case. TODO: handle upwards only merging. That should be rather easy when segment is fully sacked but I'm leaving that as future work item (it won't make very large difference anyway since this current approach already covers quite a lot of normal cases). I was earlier thinking of some sophisticated way of tracking timestamps of the first and the last segment but later on realized that it won't be that necessary at all to store the timestamp of the last segment. The cases that can occur are basically either: 1) ambiguous => no sensible measurement can be taken anyway 2) non-ambiguous is due to reordering => having the timestamp of the last segment there is just skewing things more off than does some good since the ack got triggered by one of the holes (besides some substle issues that would make determining right hole/skb even harder problem). Anyway, it has nothing to do with this change then. I choose to route some abnormal looking cases with goto noop, some could be handled differently (eg., by stopping the walking at that skb but again). In general, they either shouldn't happen at all or are rare enough to make no difference in practice. In theory this change (as whole) could cause some macroscale regression (global) because of cache misses that are taken over the round-trip time but it gets very likely better because of much less (local) cache misses per other write queue walkers and the big recovery clearing cumulative ack. Worth to note that these benefits would be very easy to get also without TSO/GSO being on as long as the data is in pages so that we can merge them. Currently I won't let that happen because DSACK splitting at fragment that would mess up pcounts due to sk_can_gso in tcp_set_skb_tso_segs. Once DSACKs fragments gets avoided, we have some conditions that can be made less strict. TODO: I will probably have to convert the excessive pointer passing to struct sacktag_state... :-) My testing revealed that considerable amount of skbs couldn't be shifted because they were cloned (most likely still awaiting tx reclaim)... [The rest is considering future work instead since I got repeatably EFAULT to tcpdump's recvfrom when I added pskb_expand_head to deal with clones, so I separated that into another, later patch] ...To counter that, I gave up on the fifth advantage: 5) When growing previous SACK block, less allocs for new skbs are done, basically a new alloc is needed only when new hole is detected and when the previous skb runs out of frags space ...which now only happens of if reclaim is fast enough to dispose the clone before the SACK block comes in (the window is RTT long), otherwise we'll have to alloc some. With clones being handled I got these numbers (will be somewhat worse without that), taken with fine-grained mibs: TCPSackShifted 398 TCPSackMerged 877 TCPSackShiftFallback 320 TCPSACKCOLLAPSEFALLBACKGSO 0 TCPSACKCOLLAPSEFALLBACKSKBBITS 0 TCPSACKCOLLAPSEFALLBACKSKBDATA 0 TCPSACKCOLLAPSEFALLBACKBELOW 0 TCPSACKCOLLAPSEFALLBACKFIRST 1 TCPSACKCOLLAPSEFALLBACKPREVBITS 318 TCPSACKCOLLAPSEFALLBACKMSS 1 TCPSACKCOLLAPSEFALLBACKNOHEAD 0 TCPSACKCOLLAPSEFALLBACKSHIFT 0 TCPSACKCOLLAPSENOOPSEQ 0 TCPSACKCOLLAPSENOOPSMALLPCOUNT 0 TCPSACKCOLLAPSENOOPSMALLLEN 0 TCPSACKCOLLAPSEHOLE 12 Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-11Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/message/fusion/mptlan.c drivers/net/sfc/ethtool.c net/mac80211/debugfs_sta.c
2008-11-10net: fix setting of skb->tail in skb_recycle_check()Lennert Buytenhek
Since skb_reset_tail_pointer() reads skb->data, we need to set skb->data before calling skb_reset_tail_pointer(). This was causing spurious skb_over_panic()s from skb_put() being called on a recycled skb that had its skb->tail set to beyond where it should have been. Bug report from Peter van Valderen <linux@ddcrew.com>. Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-06Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/ath5k/base.c net/8021q/vlan_core.c
2008-11-01net: add documentation for skb recyclingStephen Hemminger
Commit 04a4bb55bcf35b63d40fd2725e58599ff8310dd7 ("net: add skb_recycle_check() to enable netdriver skb recycling") added a method for network drivers to recycle skbuffs, but while use of this mechanism was documented in the commit message, it should really have been added as a docbook comment as well -- this patch does that. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-31mac80211: Re-enable aggregationSujith
Wireless HW without any dedicated queues for aggregation do not need the ampdu_queues mechanism present right now in mac80211. Since mac80211 is still incomplete wrt TX MQ changes, do not allow aggregation sessions for drivers that set ampdu_queues. This is only an interim hack until Intel fixes the requeue issue. Signed-off-by: Sujith <Sujith.Manoharan@atheros.com> Signed-off-by: Luis Rodriguez <Luis.Rodriguez@Atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-10-28net: reduce structures when XFRM=nAlexey Dobriyan
ifdef out * struct sk_buff::sp (pointer) * struct dst_entry::xfrm (pointer) * struct sock::sk_policy (2 pointers) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-13net: Rationalise email address: Network Specific PartsAlan Cox
Clean up the various different email addresses of mine listed in the code to a single current and valid address. As Dave says his network merges for 2.6.28 are now done this seems a good point to send them in where they won't risk disrupting real changes. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-07net: packet split receive apiPeter Zijlstra
Add some packet-split receive hooks. For one this allows to do NUMA node affine page allocs. Later on these hooks will be extended to do emergency reserve allocations for fragments. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01net: BUG instead of corrupting memory in pskb_expand_headHerbert Xu
If the caller of pskb_expand_head specifies a negative nhead we'll silently overwrite other people's memory. This patch makes it BUG instead. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-01net: add skb_recycle_check() to enable netdriver skb recyclingLennert Buytenhek
This patch adds skb_recycle_check(), which can be used by a network driver after transmitting an skb to check whether this skb can be recycled as a receive buffer. skb_recycle_check() checks that the skb is not shared or cloned, and that it is linear and its head portion large enough (as determined by the driver) to be recycled as a receive buffer. If these conditions are met, it does any necessary reference count dropping and cleans up the skbuff as if it just came from __alloc_skb(). Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-08-15net: Preserve netfilter attributes in skb_gso_segment using __copy_skb_headerHerbert Xu
skb_gso_segment didn't preserve some attributes in the original skb such as the netfilter fields. This was harmless until they were used which is the case for packets going through lo. This patch makes it call __copy_skb_header which also picks up some other missing attributes. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-29mac80211: partially fix skb->cb useJohannes Berg
This patch fixes mac80211 to not use the skb->cb over the queue step from virtual interfaces to the master. The patch also, for now, disables aggregation because that would still require requeuing, will fix that in a separate patch. There are two other places (software requeue and powersaving stations) where requeue can happen, but that is not currently used by any drivers/not possible to use respectively. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-07-25net: convert BUG_TRAP to generic WARN_ONIlpo Järvinen
Removes legacy reinvent-the-wheel type thing. The generic machinery integrates much better to automated debugging aids such as kerneloops.org (and others), and is unambiguous due to better naming. Non-intuively BUG_TRAP() is actually equal to WARN_ON() rather than BUG_ON() though some might actually be promoted to BUG_ON() but I left that to future. I could make at least one BUILD_BUG_ON conversion. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16net: make __skb_splice_bits staticHarvey Harrison
net/core/skbuff.c:1335:5: warning: symbol '__skb_splice_bits' was not declared. Should it be static? Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-15net: refactor tcp splice receive path to improve readabilityOctavian Purdila
- move all of the details on offsets, lengths and buffers into a single function instead of doing these operation from multiple places - use a bottom up approach: try to avoid details in the high level functions, introduce them gradually as we go deeper in the function call stack With helpful feedback from Jarek Poplawski. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Acked-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14vlan: Don't store VLAN tag in cbPatrick McHardy
Use a real skb member to store the skb to avoid clashes with qdiscs, which are allowed to use the cb area themselves. As currently only real devices that consume the skb set the NETIF_F_HW_VLAN_TX flag, no explicit invalidation is neccessary. The new member fills a hole on 64 bit, the skb layout changes from: __u32 mark; /* 172 4 */ sk_buff_data_t transport_header; /* 176 4 */ sk_buff_data_t network_header; /* 180 4 */ sk_buff_data_t mac_header; /* 184 4 */ sk_buff_data_t tail; /* 188 4 */ /* --- cacheline 3 boundary (192 bytes) --- */ sk_buff_data_t end; /* 192 4 */ /* XXX 4 bytes hole, try to pack */ to __u32 mark; /* 172 4 */ __u16 vlan_tci; /* 176 2 */ /* XXX 2 bytes hole, try to pack */ sk_buff_data_t transport_header; /* 180 4 */ sk_buff_data_t network_header; /* 184 4 */ Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-28Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl4965-base.c
2008-06-27tcp: fix for splice receive when used with software LROOctavian Purdila
If an skb has nr_frags set to zero but its frag_list is not empty (as it can happen if software LRO is enabled), and a previous tcp_read_sock has consumed the linear part of the skb, then __skb_splice_bits: (a) incorrectly reports an error and (b) forgets to update the offset to account for the linear part Any of the two problems will cause the subsequent __skb_splice_bits call (the one that handles the frag_list skbs) to either skip data, or, if the unadjusted offset is greater then the size of the next skb in the frag_list, make tcp_splice_read loop forever. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-19net: Discard and warn about LRO'd skbs received for forwardingBen Hutchings
Add skb_warn_if_lro() to test whether an skb was received with LRO and warn if so. Change br_forward(), ip_forward() and ip6_forward() to call it) and discard the skb if it returns true. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-11net: remove CVS keywordsAdrian Bunk
This patch removes CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-04tcp: Fix for race due to temporary drop of the socket lock in skb_splice_bits.Octavian Purdila
skb_splice_bits temporary drops the socket lock while iterating over the socket queue in order to break a reverse locking condition which happens with sendfile. This, however, opens a window of opportunity for tcp_collapse() to aggregate skbs and thus potentially free the current skb used in skb_splice_bits and tcp_read_sock. This patch fixes the problem by (re-)getting the same "logical skb" after the lock has been temporary dropped. Based on idea and initial patch from Evgeniy Polyakov. Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-05-03net: Fix useless comment reference loop.Johannes Berg
include/linux/skbuff.h says: /* These elements must be at the end, see alloc_skb() for details. */ net/core/skbuff.c says: * See comment in sk_buff definition, just before the 'tail' member This patch contains my guess as to the actual reason rather than a dead comment reference loop. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>