path: root/fs/ceph/osdmap.c
Age | Commit message | Author
2010-03-23 | ceph: fix pg pool decoding from incremental osdmap update | Sage Weil
The incremental map decoding of pg pool updates wasn't skipping the snaps and removed_snaps vectors. This caused osd requests to stall when pool snapshots were created or fs snapshots were deleted. Use a common helper for full and incremental map decoders that decodes pools properly. Signed-off-by: Sage Weil <sage@newdream.net>
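As a rough illustration of what "skipping a vector" involves in a decoder, here is a minimal userspace sketch. It assumes a hypothetical encoding (a little-endian 32-bit count followed by fixed-size elements); the function name skip_vector and that layout are assumptions for illustration, not the actual osdmap wire format or kernel helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Skip a vector encoded as a little-endian u32 count followed by
 * `count` elements of elem_size bytes each (hypothetical layout).
 * On success, advance *p past the vector and return 0.
 * If the buffer is too short, leave *p untouched and return -1.
 */
static int skip_vector(const uint8_t **p, const uint8_t *end, size_t elem_size)
{
    const uint8_t *b = *p;
    uint32_t n;

    assert(elem_size > 0);
    if ((size_t)(end - b) < 4)
        return -1;                      /* no room for the count */
    n = (uint32_t)b[0] | (uint32_t)b[1] << 8 |
        (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
    b += 4;
    if (n > (size_t)(end - b) / elem_size)
        return -1;                      /* elements would run past end */
    *p = b + (size_t)n * elem_size;
    return 0;
}
```

A decoder that forgets a step like this leaves its cursor in the middle of the vector, so every later field is read from the wrong offset. Sharing one helper between the full and incremental paths keeps the two from drifting apart.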
2010-03-01 | ceph: fix osdmap decoding when pools include (removed) snaps | Sage Weil
Add missing pointer dereference (p is a void **). Signed-off-by: Sage Weil <sage@newdream.net>
2010-02-17 | ceph: use rbtree for pg pools; decode new osdmap format | Sage Weil
Since we can now create and destroy pg pools, the pool ids will be sparse, and an array no longer makes sense for looking up by pool id. Use an rbtree instead. The OSDMap encoding also no longer has a max pool count (previously used to allocate the array). There is a new pool_max, which is the largest pool id we've ever used, although we don't actually need it in the client. Signed-off-by: Sage Weil <sage@newdream.net>
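To show why a tree suits sparse ids better than an array, here is a minimal userspace sketch of lookup by pool id. It uses a plain unbalanced binary search tree as a stand-in for the kernel rbtree (no rebalancing), and the names pool_node, pool_insert and pool_lookup are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pool record keyed by a sparse integer id. */
struct pool_node {
    int id;
    struct pool_node *left, *right;
};

/* Link n into the tree. Returns 0 on success, -1 if the id already exists. */
static int pool_insert(struct pool_node **root, struct pool_node *n)
{
    struct pool_node **link = root;

    assert(n != NULL);
    while (*link) {
        if (n->id < (*link)->id)
            link = &(*link)->left;
        else if (n->id > (*link)->id)
            link = &(*link)->right;
        else
            return -1;
    }
    n->left = n->right = NULL;
    *link = n;
    return 0;
}

/* Return the node with the given id, or NULL if there is none. */
static struct pool_node *pool_lookup(struct pool_node *root, int id)
{
    while (root) {
        if (id < root->id)
            root = root->left;
        else if (id > root->id)
            root = root->right;
        else
            return root;
    }
    return NULL;
}
```

Memory use is proportional to the number of live pools rather than to the largest id ever allocated, which is why the decoder no longer needs a maximum pool count. A red-black tree adds rebalancing on top of this so that lookups stay logarithmic.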
2010-02-17 | ceph: fix memory leak when destroying osdmap with pg_temp mappings | Sage Weil
Also move _lookup_pg_mapping into a helper. Signed-off-by: Sage Weil <sage@newdream.net>
2010-02-11 | ceph: add uid field to ceph_pg_pool | Sage Weil
Also verify encoding version as we go. Signed-off-by: Sage Weil <sage@newdream.net>
2010-01-25 | ceph: precede encoded ceph_pg_pool struct with version | Sage Weil
Signed-off-by: Sage Weil <sage@newdream.net>
2009-12-21 | ceph: fix incremental osdmap pg_temp decoding bug | Sage Weil
An incremental pg_temp wasn't being decoded properly (wrong bound on for loop). Also remove unused local variable, while we're at it. Signed-off-by: Sage Weil <sage@newdream.net>
2009-12-21 | ceph: fix error paths for corrupt osdmap messages | Sage Weil
Both osdmap_decode() and osdmap_apply_incremental() should never return NULL. Signed-off-by: Sage Weil <sage@newdream.net>
2009-12-21 | ceph: hex dump corrupt server data to KERN_DEBUG | Sage Weil
Also, print fsid using standard format, NOT hex dump. Signed-off-by: Sage Weil <sage@newdream.net>
2009-12-09 | ceph: do not feed bad device ids to crush | Sage Weil
Do not feed bad (large) device ids to CRUSH. Signed-off-by: Sage Weil <sage@newdream.net>
2009-11-07 | ceph: make CRUSH hash function a bucket property | Sage Weil
Make the integer hash function a property of the bucket it is used on. This allows us to gracefully add support for new hash functions without starting from scratch. Signed-off-by: Sage Weil <sage@newdream.net>
2009-11-06 | ceph: make object hash a pg_pool property | Sage Weil
The object will be hashed to a placement seed (ps) based on the pg_pool's hash function. This allows new hashes to be introduced into an existing object store, or selection of a hash appropriate to the objects that will be stored in a particular pool. Signed-off-by: Sage Weil <sage@newdream.net>
2009-11-06 | ceph: clean up 'osd%d down' console msg | Sage Weil
No ceph prefix. Signed-off-by: Sage Weil <sage@newdream.net>
2009-11-04 | ceph: fix endian conversions for ceph_pg | Sage Weil
The endian conversions don't quite work with the old union ceph_pg. Just make it a regular struct, and make each field __le. This is simpler and it has the added bonus of actually working. Signed-off-by: Sage Weil <sage@newdream.net>
2009-11-03 | ceph: use fixed endian encoding for ceph_entity_addr | Sage Weil
We exchange struct ceph_entity_addr over the wire and store it on disk. The sockaddr_storage.ss_family field, however, is host endianness. So, fix ss_family endianness to big endian when sending/receiving over the wire. Signed-off-by: Sage Weil <sage@newdream.net>
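The idea of pinning a host-endian field to big endian on the wire can be sketched as follows. This is a portable userspace illustration that writes the 16-bit family value byte by byte; the helper names are hypothetical, and the kernel code may well convert the field differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 16-bit address family in big-endian (network) byte order. */
static void family_to_wire(uint8_t out[2], uint16_t family)
{
    assert(out != NULL);
    out[0] = (uint8_t)(family >> 8);    /* most significant byte first */
    out[1] = (uint8_t)(family & 0xff);
}

/* Read a big-endian 16-bit address family back into host order. */
static uint16_t family_from_wire(const uint8_t in[2])
{
    assert(in != NULL);
    return (uint16_t)(((uint16_t)in[0] << 8) | in[1]);
}
```

Because the byte order is fixed by the shifts rather than by the host CPU, a little-endian and a big-endian peer produce identical bytes for the same family value.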
2009-10-30 | ceph: fix intra strip unit length calculation | Noah Watkins
Commit 645a102581b3639836b17d147c35d574fd6e8267 fixes calculation of object offset for layouts with multiple stripes per object. This updates the calculation of the length written to take into account multiple stripes per object. Signed-off-by: Noah Watkins <noah@noahdesu.com> Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-28 | ceph: fix object striping calculation for non-default striping schemes | Sage Weil
We were incorrectly calculating the object offset. If we have multiple stripe units per object, we need to shift to the start of the current su in addition to the offset within the su. Also rename bno to ono (object number) to avoid some variable naming confusion. Signed-off-by: Sage Weil <sage@newdream.net>
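The corrected mapping from a file offset to an object extent can be sketched in userspace C as below. The struct and function names are hypothetical, and the sketch assumes the object size is a whole multiple of the stripe unit. It also clamps the length to the end of the current stripe unit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout description; field names are illustrative. */
struct layout {
    uint32_t su;          /* stripe unit size in bytes */
    uint32_t sc;          /* stripe count: objects per object set */
    uint32_t object_size; /* bytes per object, a multiple of su */
};

struct extent {
    uint64_t ono;   /* object number */
    uint64_t oxoff; /* offset within that object */
    uint64_t oxlen; /* bytes that fit in the current stripe unit */
};

static struct extent map_file_extent(const struct layout *l,
                                     uint64_t off, uint64_t len)
{
    struct extent e;

    assert(l->su && l->sc && l->object_size >= l->su &&
           l->object_size % l->su == 0);

    uint64_t su_per_object = l->object_size / l->su;
    uint64_t bl = off / l->su;               /* stripe unit index in file */
    uint64_t stripeno = bl / l->sc;          /* which stripe (row) */
    uint64_t stripepos = bl % l->sc;         /* which object in the set */
    uint64_t objsetno = stripeno / su_per_object;
    uint64_t su_offset = off % l->su;        /* offset within the su */
    uint64_t left = l->su - su_offset;       /* room left in this su */

    e.ono = objsetno * l->sc + stripepos;
    /* shift to the start of the current su inside the object,
     * then add the offset within that su */
    e.oxoff = su_offset + (stripeno % su_per_object) * l->su;
    e.oxlen = len < left ? len : left;
    return e;
}
```

For example, with su = 4, sc = 2 and object_size = 8, file offset 9 falls in the second stripe unit of object 0, so the object offset is 5 rather than 1. Taking only the offset modulo the stripe unit would give 1, which is correct only when each object holds a single stripe unit.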
2009-10-28 | ceph: correct comment to match striping calculation | Sage Weil
The object extent offset is the file offset _modulo_ the stripe unit. The code was correct, the comment was wrong. Reported-by: Noah Watkins <jayhawk@soe.ucsc.edu> Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-28 | ceph: remove redundant use of le32_to_cpu | Noah Watkins
Use the stripe unit size already calculated and saved on the stack to avoid a redundant call to le32_to_cpu. Signed-off-by: Noah Watkins <noah@noahdesu.com> Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-19 | ceph: include preferred osd in placement seed | Sage Weil
Mix the preferred osd (if any) into the placement seed that is fed into the CRUSH object placement calculation. This prevents all the placement pgs from peering with the same osds. Rev the osd client protocol with this change. Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-14 | ceph: convert encode/decode macros to inlines | Sage Weil
This avoids the fugly pass by reference and makes the code a bit easier to read. Signed-off-by: Sage Weil <sage@newdream.net>
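A minimal userspace sketch of the inline style follows: the decoder takes a pointer to the cursor, returns the value, and advances the cursor, instead of a macro that assigns into a variable named by the caller. The names decode_le32 and have_bytes are hypothetical and are not the kernel's helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return nonzero if at least n bytes remain between p and end. */
static inline int have_bytes(const uint8_t *p, const uint8_t *end, size_t n)
{
    assert(p <= end);
    return (size_t)(end - p) >= n;
}

/*
 * Read a little-endian 32-bit value at *p and advance *p by 4.
 * The caller must check have_bytes(*p, end, 4) first.
 */
static inline uint32_t decode_le32(const uint8_t **p)
{
    const uint8_t *b = *p;
    uint32_t v = (uint32_t)b[0] | (uint32_t)b[1] << 8 |
                 (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;

    *p = b + 4;
    return v;
}
```

A call site then reads as an ordinary expression, such as `len = decode_le32(&p);`, and the compiler type-checks the argument. A bug like passing `p` where `*p` was intended becomes easier to spot.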
2009-10-09 | ceph: fail gracefully on corrupt osdmap (bad pg_temp mapping) | Sage Weil
Return an error and report a corrupt map instead of crying BUG(). Signed-off-by: Sage Weil <sage@newdream.net>
2009-10-06 | ceph: OSD client | Sage Weil
The OSD client is responsible for reading and writing data from/to the object storage pool. This includes determining where objects are stored in the cluster, and ensuring that requests are retried or redirected in the event of a node failure or data migration. If an OSD does not respond before a timeout expires, keepalive messages are sent across the lossless, ordered communications channel to ensure that any break in the TCP is discovered. If the session does reset, a reconnection is attempted and affected requests are resent (by the message transport layer). Signed-off-by: Sage Weil <sage@newdream.net>