
Explicit fo #407

Merged
merged 10 commits on May 5, 2018
Conversation

@mikechristie mikechristie commented Apr 30, 2018

Add back explicit failover support. It requires the ceph-iscsi-config changes here:

ceph/ceph-iscsi-config#54

@mikechristie
Collaborator Author

One note: I left implicit failover support in, but I am planning on removing it once I figure out whether we need to make the explicit failover operation execute in a different thread than the cmdproc one, because that would use the same locking/threading as the implicit support.


dillaman commented May 1, 2018

@mikechristie We also need to consider the upgrade path from implicit configurations -> explicit. How would an initiator detect that it should use explicit ALUA on an active session? Would we have to upgrade and restart tcmu-runner's across all gateway nodes prior to upgrading ceph-iscsi-config?

case -ENOENT:
/* No lock holder yet */
break;
case -ESHUTDOWN:
Collaborator

Nit: seems to be leaking RBD-specific error codes back to the generic handler. Could you just re-use the existing lock state enums (a la TCMUR_DEV_LOCK_FENCED)?

Collaborator Author

I am going to use the TCMUR_DEV_LOCK codes like suggested above.

Note: off list I said I was going to use the TCMU error codes being added in #403, but that work is not done yet. I will convert to them later.

Collaborator Author

I flip-flopped again on this. In this PR I left the -Exyz codes. In this patchset:

#413

I went and made all of the core and callout I/O path code, including the code above, use common TCMU_ status codes.
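The agreed direction can be sketched as a small errno-to-state mapping (an illustration only: the enum values mirror tcmu-runner's TCMUR_DEV_LOCK_* names, and lock_state_from_errno is a hypothetical helper, not the merged code):

```c
#include <errno.h>

/* Sketch: translating RBD callout errno values into generic lock
 * states, instead of leaking -Exyz codes to the generic handler.
 * Enum names follow tcmu-runner's TCMUR_DEV_LOCK_* convention;
 * the mapping itself is illustrative. */
enum tcmur_dev_lock_state {
	TCMUR_DEV_LOCK_UNLOCKED,
	TCMUR_DEV_LOCK_LOCKED,
	TCMUR_DEV_LOCK_FENCED,
	TCMUR_DEV_LOCK_UNKNOWN,
};

static enum tcmur_dev_lock_state lock_state_from_errno(int err)
{
	switch (err) {
	case 0:
		return TCMUR_DEV_LOCK_LOCKED;
	case -ENOENT:		/* no lock holder yet */
		return TCMUR_DEV_LOCK_UNLOCKED;
	case -ESHUTDOWN:	/* this client was fenced/blacklisted */
		return TCMUR_DEV_LOCK_FENCED;
	default:
		return TCMUR_DEV_LOCK_UNKNOWN;
	}
}
```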

rbd.c Outdated
@@ -74,6 +74,10 @@
#endif
#endif

#define TCMU_RBD_LOCKER_TAG_KEY "tcmu_rbd_locker_tag"
#define TCMU_RBD_LOCKER_TAG_FMT "tcmu_tag=%hu,rbd_client=%s"
#define TCMU_RBD_LCOKER_BUF_LEN 256
Collaborator

Nit: typo
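For context, the tag value stored under TCMU_RBD_LOCKER_TAG_KEY would be composed roughly like this (a sketch; format_locker_tag is a hypothetical helper, and the buffer-length macro is spelled without the typo flagged above):

```c
#include <stdio.h>
#include <string.h>

#define TCMU_RBD_LOCKER_TAG_KEY	"tcmu_rbd_locker_tag"
#define TCMU_RBD_LOCKER_TAG_FMT	"tcmu_tag=%hu,rbd_client=%s"
#define TCMU_RBD_LOCKER_BUF_LEN	256

/* Sketch: format the locker tag value that gets stored in the image
 * metadata, pairing the ALUA group tag with the rbd client name so a
 * later reader can tell whether the tag matches the current locker. */
static int format_locker_tag(char *buf, size_t len,
			     unsigned short tag, const char *client)
{
	int n = snprintf(buf, len, TCMU_RBD_LOCKER_TAG_FMT, tag, client);

	if (n < 0 || (size_t)n >= len)
		return -1;	/* truncated or encoding error */
	return 0;
}
```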

rbd.c Outdated
metadata_owner = strstr(buf, "rbd_client=");
if (!metadata_owner) {
tcmu_dev_err(dev, "Invalid lock tag %s.\n", buf);
/* Force initaitor to retry STPG */
Collaborator

Nit: typo
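The parse above can be sketched end to end (locker_tag_client is a hypothetical helper mirroring the strstr() check; a NULL return corresponds to the invalid-tag path that forces the initiator to retry the STPG):

```c
#include <stddef.h>
#include <string.h>

/* Sketch: extract the rbd client name from a stored lock tag of the
 * form "tcmu_tag=<n>,rbd_client=<name>". Returns NULL for a malformed
 * tag, which the caller treats as invalid and reports an error so the
 * initiator retries the STPG. */
static const char *locker_tag_client(const char *buf)
{
	const char *owner = strstr(buf, "rbd_client=");

	if (!owner)
		return NULL;	/* invalid tag */
	return owner + strlen("rbd_client=");
}
```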

tcmu_dev_dbg(dev, "owner mismatch %s %s\n", metadata_owner,
owners[0]);
/*
* A node could be in the middle of grabbing the lock or it
Collaborator

Any concern about this potential race? I had originally thought of having each target register a metadata key of something like tcmu_target_group_<group id>=<client #> just after opening the image. Then, once the lock is acquired, the owner is already in the format of client.<client #> so you would just need to query the 2 - 4 metadata tags for each group to find the matching client id. Your solution is nice since it only involves a single metadata key lookup.

Collaborator

... apparently never mind. I thought that the librbd API provided the client name but it looks like it only provides the IP address of the owner. I guess that will be something to fix when I add shared lock support (along w/ the ability to store user-provided tags on the lock).
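The mismatch check discussed in this thread amounts to a comparison like the following (a sketch; locker_tag_is_current is a hypothetical name):

```c
#include <string.h>

/* Sketch: the owner recorded in the metadata tag must match the
 * current lock owner (owners[0]) before the tag's tcmu_tag value is
 * trusted. A mismatch means a node is mid-acquisition, or died between
 * taking the lock and updating the metadata, so the tag is stale. */
static int locker_tag_is_current(const char *tag_owner,
				 const char *lock_owner)
{
	if (!tag_owner || !lock_owner)
		return 0;
	return strcmp(tag_owner, lock_owner) == 0;
}
```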

rbd.c Outdated
if (tcmu_rbd_has_lock(dev) != 1)
return TCMUR_DEV_LOCK_UNLOCKED;

ret = rbd_metadata_remove(state->image, TCMU_RBD_LOCKER_TAG_KEY);
Collaborator

Is it possible this could race w/ another client stealing the lock? Perhaps just leave it set to the old value and let the mismatch code above detect that it's out-of-date?

Collaborator Author

OK, sounds good. I was thinking that if another client took the lock, we would be blacklisted by the time we got here. But your suggestion is nicer: we already have that mismatch code above, so it saves the worry here. Will drop.

Mike Christie added 10 commits May 5, 2018 07:21
The next patches will not want infinite retries during
STPG handling. Make this configurable.

Signed-off-by: Mike Christie <[email protected]>
In this first simple pass we will run tcmu_acquire_dev_lock from
the STPG handler. In this case we cannot and do not want to
flush the ring because we would end up waiting on ourself.

Signed-off-by: Mike Christie <[email protected]>
Add more lock states.

Signed-off-by: Mike Christie <[email protected]>
tcmu_acquire_dev_lock should only try to reopen if the
device is fenced. If we cannot reach the cluster due to
a network issue then there is no point in trying to do
a reopen.

Also just return a new lock state instead of having the
extra TCMUR_LOCK return value.

Signed-off-by: Mike Christie <[email protected]>
Add back explicit failover support.

Signed-off-by: Mike Christie <[email protected]>
We need to report the remote ports' ALUA state, but the rbd handler
is not able to notify remote nodes when a lock has been taken.
This patch instead has such handlers set the active/optimized
ALUA group when the lock is taken. When an RTPG is sent, we then
get this info and update our local state.

Signed-off-by: Mike Christie <[email protected]>
This has rbd get/set the lock tag in the image metadata.

It also stores the current locker with the tag, so that if we fail
between the lock and metadata_set operations we will know whether
the tag is valid.

Signed-off-by: Mike Christie <[email protected]>
get_lock_state is no longer used, so remove it.

Signed-off-by: Mike Christie <[email protected]>
@mikechristie
Collaborator Author

@mikechristie We also need to consider the upgrade path from implicit configurations

I am still working on upgrade instructions. In some cases it is going to be messy because the switch in supported failover types is not automatically detected. It will take some manual commands.

@mikechristie mikechristie merged commit 91f4a73 into open-iscsi:master May 5, 2018