[buddypress-trac] [BuddyPress Trac] #6789: XProfile: do not store serialized arrays for multi-value profile field data
buddypress-trac
noreply at wordpress.org
Thu Mar 1 03:01:56 UTC 2018
#6789: XProfile: do not store serialized arrays for multi-value profile field data
------------------------------+-------------------------------------
Reporter: Offereins | Owner:
Type: enhancement | Status: new
Priority: normal | Milestone: Awaiting Contributions
Component: Extended Profile | Version:
Severity: normal | Resolution:
Keywords: |
------------------------------+-------------------------------------
Comment (by boonebgorges):
I've attached a new patch, with the following changes:
1. Updated after [11866].
2. Adds a unit test for the behavior of caching multi-value fields in
`get_data_for_user()` - the test helped me understand what you did, so
let's throw it in there
3. Adds a fix for a bug that came up during this test: I've removed the
`$this->exists()` check from the `if` block in the `save()` method (see
line 260 of [attachment:6789.06.patch]). When setting the data via
`xprofile_set_field_data()`, the multi-value logic in that block was not
kicking in because the generated user in the test didn't yet have the data
set (ie, `$this->exists()` failed). It seems to me that the only important
part of the `exists` check is to delete the existing value, so I moved it
down a couple of lines. This could use a review from you, though, as you
may understand all these if/else statements better than I do.
=== Some responses to your comments ===
> I find it plausible that a custom profile field would store its data as
a serialized array.
Gotcha. I think this seems fine, especially if you yourself have done
something close to this. I just wanted to hear someone else confirm that
we weren't overengineering :)
> Do you prefer using the first value's key or the second?
Standardizing on the first seems to make sense to me, though it's mostly
arbitrary. One very small argument: if you ever did a `$wpdb->get_row(
"SELECT * FROM ..." )` query to fetch this data, you'd get the row with
the lowest `id`.
> Concerning your last point, I'm not quite sure I understand your
meaning of an optional get_data() method.
The way I see it, it was a mistake that serialized data was ever
available. I understand why serialization was/is important for *storage*,
but that should've been totally invisible to external functions and/or
other classes. However, as a matter of historical fact, the `$value`
property (I said `$data` above but I meant `$value`, sorry!!) might
contain serialized data. So I think we need to keep this behavior for
backward compatibility. At the same time, this would be a good time for us
to compartmentalize serialization in our standard treatment. So, something
like this:
{{{#!php
// In `BP_XProfile_Data::populate()`:
...
$raw_value = is_array( $profiledata->value ) ? array_map( 'stripslashes',
$profiledata->value ) : stripslashes( $profiledata->value );
$this->raw_value = $raw_value;
$this->value = maybe_serialize( $raw_value );
...
// A new method in `BP_XProfile_Data`:
public function get_value() {
return $this->raw_value;
}
}}}
Then, anywhere in BP, we would no longer reference the `$value` property
directly, only `get_value()`. (Probably something parallel for
`set_value()`?) But, now that I'm looking through the codebase, I don't
actually see anywhere where we ever internally reference the `$value`
property of a `BP_XProfile_ProfileData` object; pretty much anywhere where
we fetch user profile data, it's done through a direct SQL query that is
never actually turned into a `BP_XProfile_ProfileData` object.
So, maybe there's no point in doing what I've suggested here, at least not
until we come to a point where we rethink some of the internals. If you
agree, you can ignore my comment :-D
=== Next steps ===
To move forward, I think the next step is to think about data migration.
We don't currently have a framework for migration; see #6841. Setting
aside the details of that framework for a moment, we should think about
whether we need some sort of internal backward compatibility layer for
unmigrated data as part of an initial release. Say this change is part of
v3.1. Sites running v3.0 will have serialized data for multi-value fields.
3.1 will include a migration routine that'll move the data to the new non-
serialized format. But what if I upgrade to 3.1 but have not yet run the
3.1 migration routine (either because it's asynchronous, or because it
requires manual intervention, or both)? Queries like the one in
`get_data_for_user()` will fail for multi-value fields - they'll return a
value like `array( '{a:some serialized array}' )`.
What can we do to mitigate this? It seems to me that we can't have
fallback support for non-migrated, serialized data, because - as you note
in your comment above - multi-value fields might have serialized data for
*one* of the multi-values (so `is_serialized()` will not be a reliable
check for non-migrated data). It might be that there's nothing we can do
about it, and we'll just need to inform admins that such fields may not
work correctly until the migration has been run - but it's something we
should think about a bit.
(Sorry for the long comment - trying to be thorough!!)
--
Ticket URL: <https://buddypress.trac.wordpress.org/ticket/6789#comment:20>
BuddyPress Trac <http://buddypress.org/>
BuddyPress Trac
More information about the buddypress-trac
mailing list