Resolves: bug 283041
Bug Description: MMR: Directory updates on same object
Reviewed by: nhosoi (Thanks!)
Fix Description: The problem does appear to be concurrency. I think the original intention of
the urp fixup code was that it should only be run inside the database lock, so
that the database could be restored to a consistent state before the next
operation was processed. However, this requires the database code to know when
the database is already locked, so that if e.g. a modrdn operation needs to
call an internal delete, the database should not be locked again. The flag
OP_FLAG_REPL_FIXUP is used to denote both that the operation is such an
internal operation, and that the database should not be locked again.
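The flag convention described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the slapd implementation: sketch_backend, sketch_op, SKETCH_FLAG_FIXUP and the functions below are hypothetical stand-ins, and a plain pthread mutex plays the role of the database lock.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for illustration; not the real slapd types or flag value. */
#define SKETCH_FLAG_FIXUP 0x1u

typedef struct {
    pthread_mutex_t lock;
    int lock_depth;   /* 1 while some operation holds the lock, else 0 */
    int entries;      /* toy database state */
} sketch_backend;

typedef struct {
    unsigned flags;
} sketch_op;

static void sketch_backend_init(sketch_backend *be, int entries)
{
    pthread_mutex_init(&be->lock, NULL);
    be->lock_depth = 0;
    be->entries = entries;
}

/* Delete takes the lock itself unless the operation is flagged as a fixup,
 * in which case the caller is trusted to hold the lock already. */
static int sketch_backend_delete(sketch_backend *be, const sketch_op *op)
{
    int need_lock = !(op->flags & SKETCH_FLAG_FIXUP);

    if (need_lock) {
        pthread_mutex_lock(&be->lock);
        be->lock_depth++;
    }
    /* Fails if a flagged operation is issued from outside the lock. */
    assert(be->lock_depth == 1);
    be->entries--;
    if (need_lock) {
        be->lock_depth--;
        pthread_mutex_unlock(&be->lock);
    }
    return 0;
}

/* Modrdn holds the lock and issues an internal, flagged delete, so the
 * non-recursive mutex is never locked twice by the same thread. */
static int sketch_backend_modrdn(sketch_backend *be, const sketch_op *op)
{
    sketch_op fixup = { SKETCH_FLAG_FIXUP };
    int rc;

    (void)op;
    pthread_mutex_lock(&be->lock);
    be->lock_depth++;
    rc = sketch_backend_delete(be, &fixup);
    be->lock_depth--;
    pthread_mutex_unlock(&be->lock);
    return rc;
}
```

In this sketch a flagged operation issued with the lock not held trips the assertion. That is the toy analogue of the fixup code running outside the database lock.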

There are a couple of cases where these operations can be called from outside
of the database lock:
urp_fixup_rename_entry is called from multimaster_postop_modrdn and
multimaster_postop_delete, both of which are front end post op plugins, not
called from within the database lock. Same with urp_fixup_delete_entry and
urp_fixup_modify_entry. In other cases, such as urp_fixup_add_entry, and other
places where urp_fixup_rename_entry and urp_fixup_modify_entry are called, they
are called from a bepostop plugin function, which is called after the original
database operation has been processed, within the database lock. So the
solution appears to be to move the urp_* functions to the bepostop plugin
functions. One of these functions, urp_get_min_naming_conflict_entry, does an
internal search, but search does not appear to lock the database, so nothing
needed to be done to make it "reentrant".
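The difference between the two plugin points can be sketched as below. This is a toy model under stated assumptions, not the server's plugin dispatch: pipe_backend, pipe_run_operation and the callbacks are hypothetical names, and the only property modelled is that the bepostop callback runs before the lock is released while the postop callback runs after.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct pipe_backend pipe_backend;
typedef void (*pipe_callback)(pipe_backend *be);

/* Hypothetical model of one backend with two plugin points. */
struct pipe_backend {
    pthread_mutex_t lock;
    bool locked;              /* true only while the running operation holds the lock */
    pipe_callback bepostop;   /* invoked before the lock is released */
    pipe_callback postop;     /* invoked after the lock is released */
    bool fixup_ran_locked;    /* recorded by pipe_fixup */
};

/* Stand-in for a urp fixup: records whether the lock was held when it ran. */
static void pipe_fixup(pipe_backend *be)
{
    be->fixup_ran_locked = be->locked;
}

static void pipe_noop(pipe_backend *be)
{
    (void)be;
}

static void pipe_run_operation(pipe_backend *be)
{
    assert(be->bepostop != NULL && be->postop != NULL);

    pthread_mutex_lock(&be->lock);
    be->locked = true;
    /* ... the database change itself would happen here ... */
    be->bepostop(be);         /* still inside the lock */
    be->locked = false;
    pthread_mutex_unlock(&be->lock);

    be->postop(be);           /* lock already released */
}
```

Registering pipe_fixup as the bepostop callback leaves fixup_ran_locked true. Registering it as the postop callback leaves it false, which corresponds to the situation this patch removes.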

Without this patch, I can crash the server in a matter of minutes (x86_64
rhel5) using the latest Fedora DS 1.1 code. With the patch, the server runs
for several hours (maybe longer, I had to stop the test).

Also, to really exercise the urp code, I added a rename operation between the
add and the delete, e.g.:
add("ou=test");
rename("ou=test", "ou=test2");
delete("ou=test2");
The server still runs for several hours with no problems.
Platforms tested: RHEL5 x86_64
Flag Day: no
Doc impact: no

Rich Megginson, 18 years ago
parent
commit
01061902f9
1 file changed, 16 insertions and 20 deletions
+ 16 - 20
ldap/servers/plugins/replication/repl5_plugins.c

@@ -789,12 +789,26 @@ multimaster_bepreop_modrdn (Slapi_PBlock *pb)
 int 
 multimaster_bepostop_modrdn (Slapi_PBlock *pb)
 {
+	Slapi_Operation *op;
+
+	slapi_pblock_get(pb, SLAPI_OPERATION, &op);
+	if ( ! operation_is_flag_set (op, OP_FLAG_REPL_FIXUP) )
+	{
+		urp_post_modrdn_operation (pb);
+	}
 	return 0;
 }
 
 int 
 multimaster_bepostop_delete (Slapi_PBlock *pb)
 {
+	Slapi_Operation *op;
+
+	slapi_pblock_get(pb, SLAPI_OPERATION, &op);
+	if ( ! operation_is_flag_set (op, OP_FLAG_REPL_FIXUP) )
+	{
+		urp_post_delete_operation (pb);
+	}
 	return 0;
 }
 
@@ -814,16 +828,7 @@ multimaster_postop_add (Slapi_PBlock *pb)
 int 
 multimaster_postop_delete (Slapi_PBlock *pb)
 {
-	int rc;
-	Slapi_Operation *op;
-
-	slapi_pblock_get(pb, SLAPI_OPERATION, &op);
-	if ( ! operation_is_flag_set (op, OP_FLAG_REPL_FIXUP) )
-	{
-		urp_post_delete_operation (pb);
-	}
-	rc = process_postop(pb);
-	return rc;
+	return process_postop(pb);
 }
 
 int 
@@ -835,16 +840,7 @@ multimaster_postop_modify (Slapi_PBlock *pb)
 int 
 multimaster_postop_modrdn (Slapi_PBlock *pb)
 {
-	int rc;
-	Slapi_Operation *op;
-
-	slapi_pblock_get(pb, SLAPI_OPERATION, &op);
-	if ( ! operation_is_flag_set (op, OP_FLAG_REPL_FIXUP) )
-	{
-		urp_post_modrdn_operation (pb);
-	}
-	rc = process_postop(pb);
-	return rc;
+	return process_postop(pb);
 }