Restricted access to documents and images (member only files)
If users have a direct link to a file, they will have the ability to download it (i.e. page permissions are not inherited by the files on those pages)
Restrict folders and the files in those folders by membership/administrative/group level
I'm told that WA programmers are working on this issue and it is indeed considered an important one. I'm looking forward to a simple, secure solution, as our own users' confidence is as important to our group relationship as WA's user confidence is to the business relationship.
Meanwhile, I have learned a bit more and it might at least help others repair the damage. From here on, I hope that such guidance will come from Wild Apricot, not requiring our own prospecting.
It's likely that Google found its way to our URLs via a page that had been retired but was, for some reason, no longer secure. (The club members involved are no longer involved in such things.) Expedited by Support's guidance, we made the page secure and renamed it so that it will no longer be found. However, simply renaming a page isn't a solution, as the WA system automatically sets up a redirect.
Support says it’s because the system always remembers any page’s original unique base URL (e.g., “page-1234567”). Here’s the workaround whenever a redirect is NOT wanted:
1. After renaming a page, make a copy of it (Duplicate Page).
2. Delete the original page.
3. Delete the words “Copy of” (or whatever prefix was added) from the new page’s Title and URL, so that both are the same as the original.
If the page is restricted to Members or Admins, you're probably okay now. But for good measure, now that you've also hidden the page and are sure there are no links to it (be sure it's not on any menu!), advise users not to go to the page by typing its URL into a search engine! URLs belong in a browser's address bar only! (This is yet another reason why security based solely on an obscure URL is no security at all.)
So now, to undo the damage -- how to remove pages from Google Search Engine Result Pages and cache display. (Again, this is only Google -- an unsecured document may have been cataloged by any number of search engines and Internet archive sites.)
Pardon me if I save a few minutes by pasting from our own internal advisory, which will echo some advice already given in this thread:
Go to: https://www.google.com/webmasters/tools/url-removal
You'll need to be authenticated as a webmaster for the domain. There are various ways to do this, the easiest being to log in at your domain name registrar via Google's link. Some of the other methods (e.g. uploading a .txt file to root) are apparently not possible at WA. (Google will immediately show that you have been authenticated, but allow a short time for your authentication to be displayed among Google's selection list.)
Click "Temporarily hide" and enter the full URL. For example: http://www.yourdomainname.com/Resources/MemberContent/SecureFolder/Securesubfolder/sensitivedoc.pdf
Click Continue, then Submit Request.
Google will remove the doc from its own Search Engine Results Pages (SERPs), but might still retain its content in its database. Never put anything online that you absolutely would not want to become public. (E.g., your bank password or the recipe for Coca-Cola.)
To determine the page's exact, full URL if not known: copy it from the browser's address bar. Sometimes a PDF file will instead open in Acrobat or ask that you save it to your drive; in that case, the URL isn't displayed. Some ways to find it (one or another should work):
Find the page in Google's results. For example search for site:yourdomainname.com pdf
Copy the displayed URL. If the URL is too long to be fully displayed, it will contain "/.../" and you cannot use this method.
So instead, click on the displayed link (right click to open in a new tab or window). If the document/page opens in a browser (or if not found, but the URL is displayed), copy the URL.
If the document wants to be opened in other than your browser, click on the down-caret arrow in the listing. Choose "Cache". Copy the source URL from the top of the cache-display page.
This removal request is good for only 90 days. Within that time, you'll need to rename the document(s) so that any incoming links are broken, or remove the document(s) from the server entirely. Note that the server must return a "Page Not Found" Error 404 to Google, or Google might again assume it is technically online and might display the cached version.
There are ways to prevent a page from being spidered, but many of them are purely voluntary, and some spiders and bad guys will ignore them. Also, some of those methods are not available to us at Wild Apricot, or cannot be implemented with a non-HTML file (e.g., a PDF).
This is only for Google. Other search engines may have their own removal procedures, or may not.
I'm almost embarrassed that I didn't find Google's Removal page much sooner. It seems pretty easy now that I know. But then, Einstein's non-mathematical thought experiments are pretty simple, too, once you know about them. Anyway, once discovered, the process is not so hard.
But prevention is way easier.
This requirement, like user-accessible backup/restore, is table stakes for any business offering web services for money.
So, a requirements spec is quite simple: look at any other site (e.g., Dropbox) and implement what they have. Quite simply put, give the public, members, and administrators appropriate access (no access, read, or read/write/create) to each file.
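That per-file, per-role access model is simple enough to sketch in a few lines. This is purely illustrative (the roles, filenames, and ACL structure are hypothetical, not any real WA API):

```python
from enum import Enum

class Access(Enum):
    NONE = 0        # no access
    READ = 1        # read/download only
    READ_WRITE = 2  # read/write/create

# Hypothetical per-file ACL mapping role -> access level.
acl = {
    "minutes-2023.pdf": {"public": Access.NONE, "member": Access.READ, "admin": Access.READ_WRITE},
    "event-flyer.pdf":  {"public": Access.READ, "member": Access.READ, "admin": Access.READ_WRITE},
}

def can_download(role: str, filename: str) -> bool:
    """True if the given role may read the file; unknown files default to no access."""
    return acl.get(filename, {}).get(role, Access.NONE) is not Access.NONE
```

The key point is the default: an unknown file or role gets `Access.NONE`, so a guessed URL yields nothing unless access was explicitly granted.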
With this number of votes, why does this requirement continue to fall to the bottom of the stack?
Dawn Daehn commented
I echo all of Randall's comments and concerns. Over the past 20 years I've worked on various publishing systems: they all have a built-in security system or functionality to create one. Apache servers use a .htaccess file for access. I am not a server admin and know nothing about WA infrastructure, but if WA is using Apache servers, it seems this could be a possibility.
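For readers unfamiliar with the Apache mechanism mentioned above: a `.htaccess` file in a folder can require authentication before anything in that folder is served. A minimal sketch (the path and realm name are placeholders, and this assumes a server where `.htaccess` overrides are enabled):

```apache
# Hypothetical .htaccess protecting a folder of member-only documents.
# Everything in this directory (including PDFs) requires a login.
AuthType Basic
AuthName "Members Only"
AuthUserFile /path/to/.htpasswd
Require valid-user
```

Unlike robots.txt, this is enforced by the server itself, so it protects non-HTML files such as PDFs from spiders and URL-guessers alike.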
Hacks, such as using Google Docs, Dropbox, etc, are not acceptable. These fragment our editing/publishing process and make it more difficult.
Our organization has sensitive information we want to keep hidden from the public. Currently, our content is in WA and sensitive files are hosted on an Apache file server to preserve security. The additional server costs extra money. Not ideal. I hope WA comes up with a solution soon.
I know nothing (well, not enough to be useful) about servers; I've always specified Apache for the shared-server sites I've run. But I'm kinda getting the impression that WA is running your own proprietary OS or something. I can't imagine file security not being built into any major server OS.
Anyway, maybe there's a workaround for now ... is there a way we can encrypt our PDF files before or after uploading, and have your system decrypt them based on a simple password? That might not keep out the Kremlin, but would keep spiders from publishing anything useful.
I wouldn't expect our people to start encrypting files before uploading them, expecting Members to download and decrypt them, but some browsers (or your system?) insist that PDF files be downloaded before viewing anyway (bummer!), so this might also be an option. Pretty clunky, but better than distributing our financial data and such publicly. Suggestions as to third-party software for this?
Note however, we also store some PDF files that we WANT people to see. For example the details of attending a certain event or excursion.
Currently if a person or spider knows or guesses a file's exact URL, they can read it. Even if that file can be otherwise accessed by users only by logging in and navigating to it.
This is ABSOLUTELY CRITICAL. We can't post our meeting minutes to Members without them showing up in Google? And even if we then change the file names in order to break SERP links, the files still show up in Google's cache.
Measures recommended by Google and WA are totally inadequate because:
They are on the honor system (e.g. robots.txt, DoNotIndex meta tags, etc.)
We can't predict all spiders. There are more Search Engines than Google.
We can't even implement those, because we do not have access to our own robots.txt or page headers.
Apparently Google no longer takes "manual" requests to be removed from display.
Meta tags are irrelevant where PDF files are concerned.
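To make the "honor system" point concrete, here is what such a robots.txt looks like (the folder path is illustrative, echoing the example URL earlier in the thread):

```
# robots.txt at the site root -- a polite request, not enforcement.
# Well-behaved crawlers honor it; rogue spiders can ignore it entirely,
# and it does nothing to stop a human who guesses the URL.
User-agent: *
Disallow: /Resources/MemberContent/
```

And as noted above, even this weak measure is unavailable if the hosting platform gives no access to the site's robots.txt.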
I don't understand why WA can't implement this (currently it's not even on their work-in-progress agenda); it should have been integral to their product from day one. What organization does not have documents they need to circulate only to members, or store centrally, or make available to admins ... without making them public?
I hear you, Randall. I feel the same pain about this issue, but I still cannot promise any date.
I see references to a "design solution" in this thread, but they date back to Sept 2015, and it's not on your current Road Map. Currently you seem to be saying that truly restricted, secure page/file access is not expected in the foreseeable future. Can you explain?
And sorry to air the following laundry in public, but this is a critical issue and has been an issue too long...
Some aspects of WA are very nice and the concept of consolidating admin activities is sound (even if some aspect of WA itself aren't exactly "consolidated" yet), but having to repeatedly train our frequently revolving admin users in your quirks, limitations, and workarounds, ... and the cost for what we're getting ... makes it difficult to keep justifying our staying here. (It's a group decision.) One thing that makes it difficult is that we'd enjoy more data security with a $15/mo hosting account, spreadsheets and a $60 SSL cert. Please don't make our decision easy!
This should be Development Priority One. That our organization's private documents cannot be secured without turning them into web pages is absolutely crippling. (Not to mention somewhat angering if there is no warning from WA during uploads. I didn't upload them, so I don't know about that.)
There are some measures that can be taken, but they're far short of a solution. Documents can be specified (as partial names) in robots.txt, and SE's can be asked not to catalog or display them, but this relies on spiders' polite behavior, which is hardly universal. Renaming our documents and/or their URLs might break the links, but that would probably be only temporary. As WA doesn't automatically provide traffic logs (nor apparently have access to them), we don't know what spiders have visited. And support could not tell me how Google discovered our documents. (Oddly, Google has cataloged only one particular type of document, and we don't yet know why. Hopefully that will provide a clue or even a solution.)
Repeat, if this protection is implemented by next weekend, it would still not be soon enough. Every organization should be able to exchange and post files without them becoming public.
We need a way to restrict access to files, folders, and documents so that they can only be accessed and downloaded by logged-in current members.
Skip Reddy commented
This functionality is extraordinarily important. When I asked about this feature in a support ticket, I was directed to another item.
I reiterate that this should absolutely be considered a core feature. I would also like to know how many votes this actually needs to be implemented.
Dawn Daehn commented
As mentioned by others, file protection should be a core feature.
The whole purpose of our organization switching to Wild Apricot was to consolidate web content, a blog, and a forum in one user-friendly system. If file protection isn't possible soon, I'll need to look at using another source or keep paid subscriptions for two web hosting companies.
Please move file protection to the top of the list. How long will it take?
I understand the frustration, Colin, and I'm sorry to hear that you left.
Colin Shead commented
We have been waiting for this rather basic 'core feature' for a very long time. Because of this, other poorly implemented features, and rising costs, my organisation has decided to leave Wild Apricot, and we have developed an alternative, 100% secure solution of our own to which we are migrating.
All the best
Russell Noble commented
So how many votes does this actually need to be implemented?
Member-only documents are a pretty core feature.
Scott Hendison commented
Please implement the suggested bugfix already... A basic function of a membership site should be to protect "members only" content.
Tiffany Trusty commented
We really need a secure document repository solution! You got me out of spreadsheet hell, now get me out of SharePoint / DropBox / GoogleDocs purgatory!
As an administrator of a WA site that requires this feature, I believe it's necessary to investigate other hosting options if this feature isn't implemented fairly soon. As a developer, I have a very hard time understanding why it wasn't already implemented.
P.S.: Please consider creating a version of this support website that is managed by the Wild Apricot software. Keep the WA version internal if you must. Regardless, I think the experience would give you some valuable insights into your customers' experiences and requirements. (And if you can make it robust enough to serve as the public website, it would probably help you a lot in selling your product.)
Rich - no, we usually do not provide one.
Looks good. Do we have any ETA?