Nothing Special   »   [go: up one dir, main page]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy objects: foreach edge cases #15960

Closed
wants to merge 4 commits into from

Conversation

arnaud-lb
Copy link
Member
@arnaud-lb arnaud-lb commented Sep 19, 2024

This fixes a few edge cases around foreach on lazy objects of non-iterable classes:

  • Deny resetAsLazy() on objects being iterated, as it complicates things and it makes no sense to reset an object during iteration. Moreover I'm not sure we can have a consistent behavior between ghosts and proxies (e.g. we may conserve the iteration position after reset for ghosts, but for proxies it seems difficult).
  • foreach() doesn't always use the get_properties handler, so lazy objects with a properties ht would not be initialized.
  • Iterating over proxies whose real instance is also a proxy was broken

To detect that an object is being iterated I rely on the fact that FE_RESET creates a hash iterator, so object->properties->u.v.nIteratorsCount is non-zero. However the get_iterator implementation for hooks does not initialize object->properties, and does not immediately create a hash iterator. Here I change it so that it always initializes object->properties and immediately creates a hash iterator.

/cc @iluuu1994

Depends on GH-15825.

 - Fix foreach on a object whose initialization failed before
 - Do not allow to reset an object during iteration
@arnaud-lb arnaud-lb marked this pull request as ready for review September 19, 2024 17:17
Comment on lines +170 to +172
if (UNEXPECTED(Z_TYPE(bucket->val) == IS_UNDEF)) {
return;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This happens when a dynamic property is unset:

<?php

#[AllowDynamicProperties]
class C {
    public int $a { &get { $ref = 1; return $ref; } }
}

$c = new C();
$c->dyn0 = 0;
$c->dyn1 = 1;
unset($c->dyn0);

foreach ($c as $k => &$v) {
    var_dump($k);
}

Comment on lines +6824 to +6831
if (UNEXPECTED(zend_object_is_lazy(zobj))) {
zobj = zend_lazy_object_init(zobj);
if (UNEXPECTED(EG(exception))) {
UNDEF_RESULT();
FREE_OP1_IF_VAR();
HANDLE_EXCEPTION();
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally we would rely on ->get_properties() to do the right thing, but ZEND_FE_RESET_R proceeds to separate object->properties after that.

@arnaud-lb arnaud-lb changed the base branch from master to PHP-8.4 September 25, 2024 11:15
Copy link
Member
@iluuu1994 iluuu1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay.


static bool zlo_is_iterating(zend_object *object)
{
if (object->properties && object->properties->u.v.nIteratorsCount) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use HT_ITERATORS_COUNT()?

@@ -278,9 +303,17 @@ ZEND_API zend_object *zend_object_make_lazy(zend_object *obj,

GC_DEL_FLAGS(obj, IS_OBJ_DESTRUCTOR_CALLED);

/* unset() dynamic properties */
zend_object_dtor_dynamic_properties(obj);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why this change is necessary. Reverting it to the previous code doesn't hit a test failure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not fully related to foreach, and I should probably commit this change separately, but I realized that setting properties to null here could break some assumptions. For instance, we have code like this that would break if we did it:

if (!obj->properties) {
    rebuild_properties(obj);
}
do_something();
obj->properties->..

So, just removing the dynamic properties from the ht is safer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't that also mean you're potentially accessing the wrong properties?

Copy link
Member Author
@arnaud-lb arnaud-lb Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More or less. If do_something() resets the object, the last line could potentially access a property of the object after it has been reset (the right property in the same object, but it has been reset). In the worse case this would allow to observe the state of a non-initialized lazy object.

I've made this change just in case, but it should not happen in practice because this kind of code is unsafe (before lazy objects). Normally we use the result of rebuild_properties() / get_properties() immediately without effects in between (except in # 15938 and maybe in ArrayObject). Outside of the VM and object handlers we use get_properies_for(), which doesn't have this problem because it increases the refcount.

NB: For proxies, this kind of code would be worse:

ht = get_properties(object)
do_something()
ht->...

because ht may belong to an other object, which may be released during reset. However this code is unsafe before lazy objects because ht can be released due to CoW separation. E.g.:

ht = get_properties(object)
ht' = convert_to_array(object) // ADDREF(ht), ht' === ht
foreach(object); // separates ht, so object->ht is a duplicate now and ht is RC=1
dtor(ht') // releases ht/ht'
ht->... // UB 

So we shouldn't have code like this.

@@ -513,6 +546,9 @@ ZEND_API zend_object *zend_lazy_object_init(zend_object *obj)
ZEND_ASSERT(zend_object_is_lazy_proxy(obj));
zend_lazy_object_info *info = zend_lazy_object_get_info(obj);
ZEND_ASSERT(info->flags & ZEND_LAZY_OBJECT_INITIALIZED);
if (zend_object_is_lazy(info->u.instance)) {
return zend_lazy_object_init(info->u.instance);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it a bit surprising that we allow nesting lazy proxies. I suppose this is not preventable? If some initialized lazy proxy should become lazy again, I would expect resetAsLazyProxy to be called on the proxy, rather than the backed object. If this doesn't cause any other issues, then I suppose it's fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this can happen when resetting the real instance, e.g.:

$real = new C();
$proxy = $r->newLazyProxy(function () use ($real) {
    return $real;
});
$r->resetAsLazyProxy($real, ...);

It would be possible to prevent this, but there may be valid use-cases where the caller of resetAsLazy() ignores that the object is the real instance of some proxy.

One possible issue would be cycles of proxies, but this is prevented as we don't allow to return a proxy from the factory.

@@ -252,14 +254,11 @@ static void zho_it_move_forward(zend_object_iterator *iter)
if (zend_hash_has_more_elements(properties) != SUCCESS) {
hooked_iter->declared_props_done = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also move this check to stay consistent between declared and dynamic props?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried, but it's not trivial. If zho_it_move_forward() doesn't mark declared_props_done immediately we will call zho_declared_it_fetch_current(), and zho_it_move_forward() again. But this time we will increment the dynamic position and skip the first dynamic property.

We would also get similar issues if the client called ->move_forward() a few times without calling ->get_current().

We could simplify how we chose between declared and dynamic props. E.g. use declared ones when declared_props.nInternalPointer < declared_props.nNumUsed, and dynamic ones otherwise. We can then remove declared_props_done / dynamic_props_done:

diff --git a/Zend/zend_property_hooks.c b/Zend/zend_property_hooks.c
index d701abdc3c5..c9d52f18c9f 100644
--- a/Zend/zend_property_hooks.c
+++ b/Zend/zend_property_hooks.c
@@ -25,9 +25,7 @@
 typedef struct {
 	zend_object_iterator it;
 	bool by_ref;
-	bool declared_props_done;
 	zval declared_props;
-	bool dynamic_props_done;
 	uint32_t dynamic_prop_it;
 	zval current_key;
 	zval current_data;
@@ -100,7 +98,6 @@ static void zho_dynamic_it_init(zend_hooked_object_iterator *hooked_iter)
 {
 	zend_object *zobj = Z_OBJ_P(&hooked_iter->it.data);
 	zend_array *properties = zobj->handlers->get_properties(zobj);
-	hooked_iter->dynamic_props_done = false;
 	hooked_iter->dynamic_prop_it = zend_hash_iterator_add(properties, zho_num_backed_props(zobj));
 }
 
@@ -161,7 +158,6 @@ static void zho_dynamic_it_fetch_current(zend_object_iterator *iter)
 	HashPosition pos = zend_hash_iterator_pos(hooked_iter->dynamic_prop_it, properties);
 
 	if (pos >= properties->nNumUsed) {
-		hooked_iter->dynamic_props_done = true;
 		return;
 	}
 
@@ -191,10 +187,12 @@ static void zho_it_fetch_current(zend_object_iterator *iter)
 		return;
 	}
 
+	zend_array *properties = Z_ARR(hooked_iter->declared_props);
+
 	while (true) {
-		if (!hooked_iter->declared_props_done) {
+		if (properties->nInternalPointer < properties->nNumUsed) {
 			zho_declared_it_fetch_current(iter);
-		} else if (!hooked_iter->dynamic_props_done) {
+		} else if (zend_hash_iterator_pos(hooked_iter->dynamic_prop_it, Z_OBJ(iter->data)->properties) < Z_OBJ(iter->data)->properties->nNumUsed) {
 			zho_dynamic_it_fetch_current(iter);
 		} else {
 			break;
@@ -248,13 +246,10 @@ static void zho_it_move_forward(zend_object_iterator *iter)
 	zval_ptr_dtor_nogc(&hooked_iter->current_key);
 	ZVAL_UNDEF(&hooked_iter->current_key);
 
-	if (!hooked_iter->declared_props_done) {
-		zend_array *properties = Z_ARR(hooked_iter->declared_props);
-		zend_hash_move_forward(properties);
-		if (zend_hash_has_more_elements(properties) != SUCCESS) {
-			hooked_iter->declared_props_done = true;
-		}
-	} else if (!hooked_iter->dynamic_props_done) {
+	zend_array *properties = Z_ARR(hooked_iter->declared_props);
+	if (properties->nInternalPointer < properties->nNumUsed) {
+		properties->nInternalPointer++;
+	} else {
 		zend_array *properties = Z_OBJ(iter->data)->properties;
 		HashPosition pos = zend_hash_iterator_pos(hooked_iter->dynamic_prop_it, properties);
 		pos++;
@@ -271,13 +266,9 @@ static void zho_it_rewind(zend_object_iterator *iter)
 	zval_ptr_dtor_nogc(&hooked_iter->current_key);
 	ZVAL_UNDEF(&hooked_iter->current_key);
 
-	hooked_iter->declared_props_done = false;
 	zend_array *properties = Z_ARR(hooked_iter->declared_props);
 	zend_hash_internal_pointer_reset(properties);
-	hooked_iter->dynamic_props_done = false;
-	if (hooked_iter->dynamic_prop_it != (uint32_t) -1) {
-		EG(ht_iterators)[hooked_iter->dynamic_prop_it].pos = zho_num_backed_props(Z_OBJ(iter->data));
-	}
+	EG(ht_iterators)[hooked_iter->dynamic_prop_it].pos = zho_num_backed_props(Z_OBJ(iter->data));
 }
 
 static HashTable *zho_it_get_gc(zend_object_iterator *iter, zval **table, int *n)
@@ -318,7 +309,6 @@ ZEND_API zend_object_iterator *zend_hooked_object_get_iterator(zend_class_entry
 	ZVAL_OBJ_COPY(&iterator->it.data, zobj);
 	iterator->it.funcs = &zend_hooked_object_it_funcs;
 	iterator->by_ref = by_ref;
-	iterator->declared_props_done = false;
 	zend_array *properties = zho_build_properties_ex(zobj, true, false);
 	ZVAL_ARR(&iterator->declared_props, properties);
 	zho_dynamic_it_init(iterator);

WDYT? This may be for an other PR however.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could simplify how we chose between declared and dynamic props. E.g. use declared ones when declared_props.nInternalPointer < declared_props.nNumUsed, and dynamic ones otherwise.

This would make more sense now that we always compute the properties hashtable. But we can do this in a separate PR.

if (UNEXPECTED(Z_TYPE(bucket->val) == IS_UNDEF)) {
return;
}

if (hooked_iter->by_ref && Z_TYPE(bucket->val) != IS_REFERENCE) {
ZEND_ASSERT(Z_TYPE(bucket->val) != IS_UNDEF);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assert can be removed now.

if (hooked_iter->dynamic_prop_it != (uint32_t) -1) {
EG(ht_iterators)[hooked_iter->dynamic_prop_it].pos = zho_num_backed_props(Z_OBJ(iter->data));
}
ZEND_ASSERT(hooked_iter->dynamic_prop_it != (uint32_t) -1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer assigned -1 anywhere, so this assert is kinda useless.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right

arnaud-lb added a commit that referenced this pull request Oct 3, 2024
@arnaud-lb arnaud-lb closed this in c9dfb77 Oct 3, 2024
arnaud-lb added a commit that referenced this pull request Oct 3, 2024
* PHP-8.4:
  [ci skip] NEWS for GH-15960
  Deny resetting an object as lazy during property iteration
  Ensure to initialize lazy object in foreach
  Do not null out obj->properties when resetting object
  Fix handling of undef property during foreach by ref on hooked class
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants