linux-toradex.git/include/linux/shm.h, branch v4.11-rc3

ipc/shm.c: is_file_shm_hugepages() can be boolean

2016-01-21T01:09:18+00:00

Make is_file_shm_hugepages() return bool to improve readability due to
this particular function only using either one or zero as its return
value.

No functional change.

Signed-off-by: Yaowei Bai 
Acked-by: Michal Hocko 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

shm: remove unneeded extern for function

2014-08-08T22:57:26+00:00

A small cleanup while changing adjacent code.  Extern is not needed for
functions and only one declaration had it so remove it from the odd line.

Signed-off-by: Milton Miller 
Signed-off-by: Jack Miller 
Cc: Davidlohr Bueso 
Cc: Manfred Spraul 
Cc: Anton Blanchard 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

shm: make exit_shm work proportional to task activity

2014-08-08T22:57:26+00:00

This is small set of patches our team has had kicking around for a few
versions internally that fixes tasks getting hung on shm_exit when there
are many threads hammering it at once.

Anton wrote a simple test to cause the issue:

  http://ozlabs.org/~anton/junkcode/bust_shm_exit.c

Before applying this patchset, this test code will cause either hanging
tracebacks or pthread out of memory errors.

After this patchset, it will still produce output like:

  root@somehost:~# ./bust_shm_exit 1024 160
  ...
  INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 116, t=2111 jiffies, g=241, c=240, q=7113)
  INFO: Stall ended before state dump start
  ...

But the task will continue to run along happily, so we consider this an
improvement over hanging, even if it's a bit noisy.

This patch (of 3):

exit_shm obtains the ipc_ns shm rwsem for write and holds it while it
walks every shared memory segment in the namespace.  Thus the amount of
work is related to the number of shm segments in the namespace not the
number of segments that might need to be cleaned.

In addition, this occurs after the task has been notified the thread has
exited, so the number of tasks waiting for the ns shm rwsem can grow
without bound until memory is exausted.

Add a list to the task struct of all shmids allocated by this task.  Init
the list head in copy_process.  Use the ns->rwsem for locking.  Add
segments after id is added, remove before removing from id.

On unshare of NEW_IPCNS orphan any ids as if the task had exited, similar
to handling of semaphore undo.

I chose a define for the init sequence since its a simple list init,
otherwise it would require a function call to avoid include loops between
the semaphore code and the task struct.  Converting the list_del to
list_del_init for the unshare cases would remove the exit followed by
init, but I left it blow up if not inited.

Signed-off-by: Milton Miller 
Signed-off-by: Jack Miller 
Cc: Davidlohr Bueso 
Cc: Manfred Spraul 
Cc: Anton Blanchard 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

ipc/shm.c: increase the defaults for SHMALL, SHMMAX

2014-06-06T23:08:14+00:00

System V shared memory

a) can be abused to trigger out-of-memory conditions and the standard
   measures against out-of-memory do not work:

    - it is not possible to use setrlimit to limit the size of shm segments.

    - segments can exist without association with any processes, thus
      the oom-killer is unable to free that memory.

b) is typically used for shared information - today often multiple GB.
   (e.g. database shared buffers)

The current default is a maximum segment size of 32 MB and a maximum
total size of 8 GB.  This is often too much for a) and not enough for
b), which means that lots of users must change the defaults.

This patch increases the default limits (nearly) to the maximum, which
is perfect for case b).  The defaults are used after boot and as the
initial value for each new namespace.

Admins/distros that need a protection against a) should reduce the
limits and/or enable shm_rmid_forced.

Unix has historically required setting these limits for shared memory,
and Linux inherited such behavior.  The consequence of this is added
complexity for users and administrators.  One very common example are
Database setup/installation documents and scripts, where users must
manually calculate the values for these limits.  This also requires
(some) knowledge of how the underlying memory management works, thus
causing, in many occasions, the limits to just be flat out wrong.
Disabling these limits sooner could have saved companies a lot of time,
headaches and money for support.  But it's never too late, simplify
users life now.

Further notes:
- The patch only changes default, overrides behave as before:
        # sysctl kernel.shmall=33554432
  would recreate the previous limit for SHMMAX (for the current namespace).

- Disabling sysv shm allocation is possible with:
        # sysctl kernel.shmall=0
  (not a new feature, also per-namespace)

- The limits are intentionally set to a value slightly less than ULONG_MAX,
  to avoid triggering overflows in user space apps.
  [not unreasonable, see http://marc.info/?l=linux-mm&m=139638334330127]

Signed-off-by: Manfred Spraul 
Signed-off-by: Davidlohr Bueso 
Reported-by: Davidlohr Bueso 
Acked-by: Michael Kerrisk 
Acked-by: KOSAKI Motohiro 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

ipc: whitespace cleanup

2014-01-28T05:02:39+00:00

The ipc code does not adhere the typical linux coding style.
This patch fixes lots of simple whitespace errors.

- mostly autogenerated by
  scripts/checkpatch.pl -f --fix \
	--types=pointer_location,spacing,space_before_tab
- one manual fixup (keep structure members tab-aligned)
- removal of additional space_before_tab that were not found by --fix

Tested with some of my msg and sem test apps.

Andrew: Could you include it in -mm and move it towards Linus' tree?

Signed-off-by: Manfred Spraul 
Suggested-by: Li Bin 
Cc: Joe Perches 
Acked-by: Rafael Aquini 
Cc: Davidlohr Bueso 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm: support more pagesizes for MAP_HUGETLB/SHM_HUGETLB

2012-12-12T01:22:25+00:00

There was some desire in large applications using MAP_HUGETLB or
SHM_HUGETLB to use 1GB huge pages on some mappings, and stay with 2MB on
others.  This is useful together with NUMA policy: use 2MB interleaving
on some mappings, but 1GB on local mappings.

This patch extends the IPC/SHM syscall interfaces slightly to allow
specifying the page size.

It borrows some upper bits in the existing flag arguments and allows
encoding the log of the desired page size in addition to the *_HUGETLB
flag.  When 0 is specified the default size is used, this makes the
change fully compatible.

Extending the internal hugetlb code to handle this is straight forward.
Instead of a single mount it just keeps an array of them and selects the
right mount based on the specified page size.  When no page size is
specified it uses the mount of the default page size.

The change is not visible in /proc/mounts because internal mounts don't
appear there.  It also has very little overhead: the additional mounts
just consume a super block, but not more memory when not used.

I also exported the new flags to the user headers (they were previously
under __KERNEL__).  Right now only symbols for x86 and some other
architecture for 1GB and 2MB are defined.  The interface should already
work for all other architectures though.  Only architectures that define
multiple hugetlb sizes actually need it (that is currently x86, tile,
powerpc).  However tile and powerpc have user configurable hugetlb
sizes, so it's not easy to add defines.  A program on those
architectures would need to query sysfs and use the appropiate log2.

[akpm@linux-foundation.org: cleanups]
[rientjes@google.com: fix build]
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Andi Kleen 
Cc: Michael Kerrisk 
Acked-by: Rik van Riel 
Acked-by: KAMEZAWA Hiroyuki 
Cc: Hillf Danton 
Signed-off-by: David Rientjes 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

UAPI: (Scripted) Disintegrate include/linux

2012-10-13T09:46:48+00:00

Signed-off-by: David Howells 
Acked-by: Arnd Bergmann 
Acked-by: Thomas Gleixner 
Acked-by: Michael Kerrisk 
Acked-by: Paul E. McKenney 
Acked-by: Dave Jones

ipc: add COMPAT_SHMLBA support

2012-07-31T00:25:20+00:00

If the SHMLBA definition for a native task differs from the definition for
a compat task, the do_shmat() function would need to handle both.

This patch introduces COMPAT_SHMLBA, which is used by the compat shmat
syscall when calling the ipc code and allows architectures such as AArch64
(where the native SHMLBA is 64k but the compat (AArch32) definition is
16k) to provide the correct semantics for compat IPC system calls.

Cc: David S. Miller 
Cc: Chris Zankel 
Cc: Arnd Bergmann 
Acked-by: Catalin Marinas 
Signed-off-by: Will Deacon 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

shm: handle separate PID namespaces case

2011-07-30T18:44:19+00:00

shm_try_destroy_orphaned() and shm_try_destroy_current() didn't handle
the case of separate PID namespaces, but a single IPC namespace.  If
there are tasks with the same PID values using the same shmem object,
the wrong destroy decision could be reached.

On shm segment creation store the pointer to the creator task in
shmid_kernel->shm_creator field and zero it on task exit.  Then
use the ->shm_creator insread of shm_cprid in both functions.  As
shmid_kernel object is already locked at this stage, no additional
locking is needed.

Signed-off-by: Vasiliy Kulikov 
Acked-by: Serge Hallyn 
Signed-off-by: Linus Torvalds

ipc: introduce shm_rmid_forced sysctl

2011-07-26T23:49:44+00:00

Add support for the shm_rmid_forced sysctl.  If set to 1, all shared
memory objects in current ipc namespace will be automatically forced to
use IPC_RMID.

The POSIX way of handling shmem allows one to create shm objects and
call shmdt(), leaving shm object associated with no process, thus
consuming memory not counted via rlimits.

With shm_rmid_forced=1 the shared memory object is counted at least for
one process, so OOM killer may effectively kill the fat process holding
the shared memory.

It obviously breaks POSIX - some programs relying on the feature would
stop working.  So set shm_rmid_forced=1 only if you're sure nobody uses
"orphaned" memory.  Use shm_rmid_forced=0 by default for compatability
reasons.

The feature was previously impemented in -ow as a configure option.

[akpm@linux-foundation.org: fix documentation, per Randy]
[akpm@linux-foundation.org: fix warning]
[akpm@linux-foundation.org: readability/conventionality tweaks]
[akpm@linux-foundation.org: fix shm_rmid_forced/shm_forced_rmid confusion, use standard comment layout]
Signed-off-by: Vasiliy Kulikov 
Cc: Randy Dunlap 
Cc: "Eric W. Biederman" 
Cc: "Serge E. Hallyn" 
Cc: Daniel Lezcano 
Cc: Oleg Nesterov 
Cc: Tejun Heo 
Cc: Ingo Molnar 
Cc: Alan Cox 
Cc: Solar Designer 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds