
A/B Testing Captive Portal Designs for Higher Sign-Up Conversion

This technical reference guide provides a step-by-step methodology for running statistically valid A/B tests on captive portal designs. It covers sample size calculations, test duration planning, and results interpretation to drive higher guest WiFi sign-up conversion for venue operators and IT teams.

📖 6 min read · 📝 1,311 words · 🔧 2 examples · 3 questions · 📚 8 key terms

🎧 Listen to this guide

View transcript
Welcome to the Purple Intelligence Briefing. I'm your host, and today we're tackling a topic that sits right at the intersection of network operations and commercial performance: how to run statistically valid A/B tests on your captive portal designs to drive higher guest WiFi sign-up rates. Whether you're managing a hotel estate, a retail chain, a stadium, or a conference centre, your captive portal is the front door to your first-party data strategy. And yet, most organisations deploy a single portal design and leave it running indefinitely — never testing, never optimising. That's the equivalent of running a single version of your website homepage for five years without ever looking at the analytics. Today, we're going to change that. Let me set the scene. The average unoptimised captive portal in a hospitality environment converts somewhere between 22 and 30 percent of connecting devices into registered profiles. After a structured A/B testing programme, that figure typically rises to between 40 and 52 percent. That's not a marginal improvement — that's a near-doubling of your first-party data acquisition rate, which has direct implications for your CRM pipeline, your marketing automation workflows, and ultimately your revenue per guest. So, let's get into the technical methodology. The first thing to understand is what we're actually testing. A captive portal A/B test is a controlled experiment where you split incoming WiFi users into two or more groups — each group sees a different version of your splash page — and you measure which version produces a higher sign-up completion rate. The key word here is "controlled." This is not a sequential test where you run version A for a month, then version B for a month. That approach is fundamentally flawed because it confounds your results with seasonal variation, footfall changes, and event calendars. You need concurrent, randomised assignment. Most enterprise WiFi platforms — including Purple — support multi-variant portal configuration, which means you can serve different portal designs simultaneously from the same SSID. The platform handles the randomised assignment, typically using a hash of the device MAC address or a session token to ensure each user sees the same variant consistently across their session, while the overall split remains close to 50-50. Now, let's talk about the single most important concept in any A/B test: statistical significance. This is where most organisations go wrong. They run a test for a week, see that variant B has a higher conversion rate, declare it the winner, and deploy it. But without sufficient sample size, that result is almost certainly noise. Here's the framework you need to apply. Before you start any test, you must define three parameters. First, your baseline conversion rate — that's your current portal's sign-up rate, which you should already have from your WiFi analytics dashboard. Second, your minimum detectable effect, or MDE — this is the smallest improvement you actually care about. If your baseline is 28 percent, you might decide that a 5 percentage point improvement is the minimum worth acting on. Third, your confidence level — the industry standard is 95 percent, meaning you accept a 5 percent probability of a false positive. With those three inputs, you can calculate your required sample size per variant using the standard formula: n equals Z-squared multiplied by p times one minus p, divided by MDE-squared. 
For a baseline of 28 percent, an MDE of 5 percentage points, and 95 percent confidence, you need approximately 2,800 unique visitors per variant. That means 5,600 total sessions before you can draw any conclusions. Now translate that into calendar time. If your venue sees 800 unique device connections per day, you're looking at a minimum of 7 days. But here's the critical nuance: you should never run a test for fewer than two full business cycles, regardless of whether you've hit your sample size target. A "business cycle" in this context means the repeating pattern of your footfall — for a hotel, that's typically a full week to capture both leisure and business travellers. For a retail environment, it might be two weeks to capture both weekday and weekend shopping patterns. For a stadium, it means running the test across multiple comparable events. Why does this matter? Because day-of-week effects are real and significant. A portal test that runs only Monday to Friday in a business hotel will over-represent corporate travellers and under-represent leisure guests. Your winning variant might perform brilliantly for one segment and poorly for the other. Running across full cycles averages out these effects. Let me give you a concrete example from the hospitality sector. A regional hotel group with 12 properties wanted to increase their guest WiFi registration rate to improve their direct booking programme. Their baseline portal had a 26 percent sign-up rate. They were using a three-field form — name, email, and room number — with a generic "Connect to WiFi" call to action. They structured an A/B test with two variants. Variant A was their existing design — the control. Variant B reduced the form to two fields — email and room number only — and changed the call to action to "Access Free High-Speed WiFi." They also added a single line of value proposition copy: "Stay connected and receive exclusive member offers." The test ran for 21 days across all 12 properties, accumulating 34,000 unique sessions. Variant B achieved a 41 percent sign-up rate against variant A's 26 percent — a 15 percentage point lift, well above their 5 percentage point MDE threshold, with a p-value of less than 0.001. The result was unambiguous. What drove the improvement? Post-test analysis pointed to two factors. First, reducing form fields from three to two lowered the perceived friction significantly. Research in conversion rate optimisation consistently shows that each additional form field reduces completion rates by approximately 11 percent. Second, the revised call to action addressed the user's immediate motivation — fast, free connectivity — rather than the brand's motivation, which was data capture. Now let's move to the retail environment. A shopping centre operator managing a 140-unit mall wanted to improve their WiFi sign-up rate to feed their footfall analytics and tenant marketing platform. Their baseline was 19 percent — lower than hospitality, which is typical for retail because the dwell time is shorter and the perceived need for WiFi is lower. They ran a three-variant test — what's sometimes called an A/B/C test. Variant A was their control: a standard email-and-name form with a "Sign In" button. Variant B replaced the form with a single-click social login via email — "Continue with Google" or "Continue with Apple." Variant C used a single email field with the copy "Get 10% off your next purchase at participating stores — enter your email to connect." 
After 28 days and 62,000 sessions, the results were striking. Variant B — social login — achieved 34 percent conversion, a 15 percentage point lift. Variant C — the discount incentive — achieved 31 percent. Variant A remained at 19 percent. The operator deployed Variant B as the primary portal but retained Variant C as a seasonal overlay during promotional periods. The key learning here is that in low-dwell environments, reducing authentication friction is more impactful than adding incentives. Social login removes the cognitive load of entering credentials on a mobile keyboard, which is the primary barrier in retail settings. Now, let me address some common implementation pitfalls. The first is novelty effect bias. When you launch a new portal design, there's often a short-term spike in engagement simply because it looks different. This is why your warm-up period — the first three days of a test — should be excluded from your analysis. Only count data from day four onwards. The second pitfall is running too many variants simultaneously. It's tempting to test five or six design changes at once to accelerate learning. But each additional variant dilutes your traffic, extends the time needed to reach statistical significance, and makes it harder to attribute results to specific changes. Unless you have very high traffic volumes — above 5,000 daily sessions — stick to two variants per test. The third pitfall is ignoring GDPR compliance in your test design. Every variant you test must meet your data protection obligations. If you're testing a variant that requests additional personal data fields, you need to ensure that the consent mechanism is equally prominent in both variants. Running a test where variant A has a clearly visible privacy notice and variant B buries it in small print will produce a conversion lift that you cannot legally exploit. Your legal team should sign off on every portal variant before it goes live. The fourth pitfall is what I call "winner's curse" — deploying a winning variant without understanding why it won. Always conduct a post-test analysis that segments your results by device type, time of day, and visitor segment where possible. A variant that wins on mobile may underperform on desktop. A variant that wins during peak footfall may struggle during quiet periods. Understanding the mechanism of improvement makes your next test smarter. Now, a rapid-fire round on the questions we get asked most frequently. "How long should my test run?" Minimum two full business cycles, never fewer than 14 days, and only after hitting your minimum sample size. If you haven't hit sample size after 30 days, your traffic is too low to run a valid test — consider pooling data across multiple sites. "What's the most impactful element to test first?" Call-to-action copy, consistently. It has the highest impact-to-effort ratio and takes less than an hour to implement. Start there before touching form fields or visual design. "Can I test on a single site?" Yes, but with caveats. Single-site tests are valid if you have sufficient traffic. If your site sees fewer than 300 unique daily connections, you'll need 30 or more days to reach significance — at which point seasonal drift becomes a real concern. Multi-site testing, where the same variants run across comparable venues simultaneously, is the more robust approach. "What about multi-variate testing?" MVT — multi-variate testing — allows you to test combinations of changes simultaneously. 
It's more efficient than sequential A/B tests but requires significantly more traffic. As a rule of thumb, you need at least 1,000 daily sessions per variant combination. For most venue operators, sequential A/B testing is the right starting point. To summarise the key principles from today's briefing. One: always calculate your required sample size before launching a test — never declare a winner on gut feel. Two: run tests for at least two full business cycles, regardless of early results. Three: test one element at a time, starting with call-to-action copy. Four: exclude the first three days of data to eliminate novelty effect bias. Five: ensure every variant is GDPR-compliant before deployment. Six: segment your post-test results by device type and visitor cohort to understand the mechanism of improvement. If you're operating on Purple's platform, the multi-variant portal capability gives you the infrastructure to implement everything we've discussed today without additional development overhead. The analytics layer provides the session data you need for sample size tracking, and the portal builder supports concurrent variant deployment from a single management console. Your next step is straightforward: pull your current portal's sign-up rate from your WiFi analytics dashboard, set a 5 percentage point MDE as your target, calculate your required sample size, and design your first variant with a revised call-to-action copy. You can be running a statistically valid test within 48 hours. Thank you for joining the Purple Intelligence Briefing. If you found this useful, explore our guides on event-driven marketing automation and WiFi-triggered email workflows — links in the show notes. Until next time.


Executive Summary

For enterprise venue operators, the captive portal is the critical ingestion point for first-party guest data. Yet many organisations deploy a static splash page and leave it running indefinitely, ignoring the substantial conversion uplift available through structured experimentation. The average unoptimised captive portal in a hospitality or retail environment converts between 20% and 30% of connecting devices into registered profiles. Through rigorous A/B testing of design elements, authentication flows, and value propositions, organisations can reliably raise this baseline to 40-50% or higher.

This guide provides an end-to-end methodology for structuring, running, and analysing A/B tests on captive portal designs. It goes beyond basic layout tweaks to address the statistical rigour required for valid results, specifically sample size calculations, test duration planning, and mitigation of common experimental errors such as novelty bias. By leveraging platforms that support multi-variant portals, such as Purple's Guest WiFi solution, IT and marketing teams can transform their guest network from a cost centre into a high-converting data acquisition engine.

Detailed Technical Analysis: The Mechanics of Captive Portal Testing

A captive portal A/B test is a controlled experiment in which incoming WiFi traffic is split randomly and evenly between two or more variations of a splash page. The goal is to identify which variation produces a higher rate of successful registrations (the conversion event).

Traffic Routing and Session Persistence

To preserve experimental validity, the test infrastructure must ensure session persistence. When a user connects to the SSID and is intercepted by the gateway, the RADIUS server or cloud controller assigns them a specific variant (for example, Variant A or Variant B). This assignment is typically handled via a hash of the device's MAC address. It is essential that if the user disconnects and reconnects during the test period, they are served exactly the same variant they saw initially. Failing to maintain this persistence contaminates the data, because users exposed to multiple variants cannot be cleanly attributed to either one.
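As an illustration of how deterministic assignment works, the sketch below hashes the client MAC address together with a test identifier into a stable bucket. This is a minimal Python sketch under the assumption that your testing layer can see the MAC (or a session token); commercial platforms such as Purple handle this internally, so the function name and logic here are illustrative rather than any vendor's API.

```python
import hashlib

VARIANTS = ["A", "B"]  # 50/50 split for a standard A/B test

def assign_variant(mac_address: str, test_id: str = "portal-test-1") -> str:
    """Deterministically map a device to a portal variant.

    Hashing the MAC together with a test identifier keeps the assignment
    stable across reconnections (session persistence) while allowing a
    future test to re-randomise the same devices.
    """
    key = f"{test_id}:{mac_address.lower()}"
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return VARIANTS[int(digest, 16) % len(VARIANTS)]

print(assign_variant("AA:BB:CC:DD:EE:01"))  # the same input always yields the same variant
```

Salting the hash with a test identifier means the same devices can be re-randomised when a new test starts, without breaking persistence within a single test.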

Statistical Significance and Minimum Detectable Effect (MDE)

The most common failure mode in A/B testing is ending the experiment prematurely. Observing a higher conversion rate on Variant B after three days does not guarantee a winning design; it may simply be statistical noise. To ensure results are reliable, teams must calculate the required sample size before the test begins.

The calculation requires three inputs:

  1. Baseline Conversion Rate ($p$): The current sign-up rate of your existing portal, available from your WiFi analytics dashboard.
  2. Minimum Detectable Effect (MDE): The smallest relative or absolute improvement that justifies the operational cost of deploying the new design. For captive portals, an absolute MDE of 5 percentage points is standard.
  3. Significance Level ($\alpha$): The probability of rejecting the null hypothesis when it is true (a false positive). The industry standard is a 95% confidence level ($\alpha = 0.05$).

[Infographic: sample size calculator]

Using the standard formula for comparing two proportions, a venue with a baseline conversion rate of 25% seeking an absolute improvement of 5 percentage points at 95% confidence requires approximately 3,000 unique visitors per variant.
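For planning purposes, that calculation is easy to script. The following is a minimal Python sketch of the normal-approximation sample size for two proportions; the function name and defaults are illustrative, not part of any platform API. Note that the result depends strongly on the statistical power you assume: roughly 1,200 visitors per variant at the conventional 80% power, approaching the 3,000 quoted above under a much more conservative (around 99%) power assumption.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variant(baseline, mde_abs, alpha=0.05, power=0.80):
    """Normal-approximation sample size for a two-proportion test.

    baseline : current conversion rate (e.g. 0.25)
    mde_abs  : absolute lift worth detecting (e.g. 0.05 for 5 points)
    alpha    : two-sided false-positive rate (0.05 = 95% confidence)
    power    : probability of detecting the lift if it is real
    """
    p1, p2 = baseline, baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 at alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)

for power in (0.80, 0.99):
    n = sample_size_per_variant(0.25, 0.05, power=power)
    print(f"power={power:.0%}: ~{n} unique visitors per variant")
# power=80%: ~1,248 per variant; power=99%: ~2,920 per variant,
# close to the "approximately 3,000" planning figure used in this guide
```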

Standards and Compliance Considerations

When altering authentication flows, tests must adhere to the underlying network standards and regulatory frameworks.

  • IEEE 802.1X / EAP: If you are testing seamless authentication methods (such as Passpoint/Hotspot 2.0) against traditional open SSIDs with captive portals, ensure that RADIUS accounting records correctly attribute each session to its variant.
  • GDPR / CCPA compliance: Any variant that alters data collection fields (for example, adding a phone number field) must maintain compliant consent mechanisms. A variant cannot "win" simply by hiding the privacy policy.
  • PCI DSS: If you are testing paid WiFi tiers, ensure that payment gateway integrations remain isolated from the main corporate network.

Implementation Guide: Structuring Your First Test

Running a statistically valid test requires a disciplined, vendor-neutral approach. Follow this step-by-step implementation framework.

Phase 1: Hypothesis Generation and Variant Design

Do not test random changes. Every test should start from a clear hypothesis. For example: "Reducing the authentication form from three fields (Name, Email, Postcode) to a single field (Email only) will reduce friction and increase conversion by at least 5 percentage points."

When designing variants, focus on high-impact elements first. As the conversion impact chart below shows, changes to the call-to-action (CTA) copy and form fields deliver significantly larger returns than minor colour adjustments.

[Chart: conversion impact of portal design elements]

Phase 2: Configuration and Quality Assurance

Configure the variants within your captive portal management platform. Ensure that:

  • The split is configured at 50/50 for a standard A/B test (a quick sample-ratio check is sketched after this list).
  • Analytics tracking is correctly implemented on the success page (the post-authentication redirect) so conversions are counted accurately.
  • Both variants are tested on multiple device types (iOS, Android, Windows, macOS) and browsers (Safari, Chrome, native captive portal mini-browsers) before launch.
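One quick QA check once traffic starts flowing is to confirm that the observed split really is close to 50/50; a sample-ratio mismatch usually points to an assignment or caching problem. A minimal sketch with illustrative session counts:

```python
from statistics import NormalDist

def sample_ratio_check(n_a, n_b, expected_share=0.5):
    """Test whether the observed split is plausibly the configured ratio."""
    n = n_a + n_b
    se = (expected_share * (1 - expected_share) / n) ** 0.5
    z = (n_a / n - expected_share) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p = sample_ratio_check(2730, 2662)  # illustrative counts per variant
print(f"z = {z:.2f}, p = {p:.3f}")     # z ~ 0.93, p ~ 0.35: no evidence of a skewed split
```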

Phase 3: Test Execution and Duration

Launch the test, but do not monitor the results daily. Constantly reviewing interim results leads to observation bias ("peeking"), which increases the likelihood of falsely declaring a winner.

Run the test for a minimum of two full business cycles (typically 14 days) to account for day-of-week variations in footfall. For example, a hospitality venue sees different demographic profiles on a Tuesday (corporate travellers) than on a Saturday (leisure guests). Even if you reach the required sample size by day 5, let the test run its full course to ensure the winning variant performs well across all audience segments.
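When the test has run its full course, confirm the result with a two-proportion significance test rather than comparing the raw rates by eye. A self-contained sketch (the helper name and the example counts are illustrative only):

```python
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Illustrative numbers: 3,000 sessions per variant,
# 750 sign-ups on the control (25%) vs 900 on the challenger (30%)
z, p = two_proportion_z_test(750, 3000, 900, 3000)
print(f"z = {z:.2f}, p = {p:.4f}")  # p < 0.05: the lift is unlikely to be noise
```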

Best Practices for High-Converting Portals

Based on aggregated data from enterprise deployments, the following principles consistently drive higher sign-up rates:

  1. Minimise Entry Friction: Every additional form field reduces conversion (a quick illustration of the cumulative effect follows this list). If all you need to trigger Event-Driven Marketing Automation Triggered by WiFi Presence is an email address, do not ask for a date of birth.
  2. Leverage Social Authentication: In high-traffic environments such as transport hubs or shopping centres, offering one-click authentication via Google, Apple, or Facebook significantly outperforms manual data entry, especially on mobile devices.
  3. Value-Oriented Copy: Replace generic CTAs such as "Connect to WiFi" with value-led copy such as "Get High-Speed Access" or "Join today for 10% off".
  4. Optimise for the Mini-Browser: The captive portal often loads in a restricted mini-browser (CNA, Captive Network Assistant) rather than a full browser. Avoid complex JavaScript, heavy background videos, or external web fonts that may fail to load or time out over a pre-authentication connection.
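To illustrate point 1, the podcast transcript above cites a rule of thumb of roughly an 11% relative drop in completion per additional form field. Treating that figure (and the starting rate) purely as assumptions, the cumulative effect compounds quickly:

```python
baseline_completion = 0.40   # illustrative completion rate with a one-field form
per_field_penalty = 0.11     # rule-of-thumb relative drop per extra field (see transcript)

for extra_fields in range(4):
    rate = baseline_completion * (1 - per_field_penalty) ** extra_fields
    print(f"{1 + extra_fields} field(s): ~{rate:.0%} completion")
# 1 field: ~40%, 2 fields: ~36%, 3 fields: ~32%, 4 fields: ~28%
```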

Troubleshooting and Risk Mitigation

When tests fail to produce actionable results, or negatively impact the user experience, the cause is usually one of these common failure modes:

Failure Mode: Novelty Effect
Root Cause: Returning users engage with a new design simply because it is different, causing an initial spike that regresses to the mean.
Mitigation: Discard the first 3-4 days of test data (the "warm-up" period) before calculating significance.

Failure Mode: CNA Timeouts
Root Cause: Variant B includes heavy assets (images/scripts) that take too long to load over the walled-garden connection, causing the operating system to close the portal.
Mitigation: Keep total page weight below 500 KB. Use system fonts and compress all images.

Failure Mode: Contaminated Attribution
Root Cause: Users roaming between access points trigger multiple portal impressions, skewing the visitor count.
Mitigation: Ensure the analytics platform deduplicates sessions based on MAC address within a 24-hour window.
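To illustrate the deduplication mitigation, the sketch below collapses repeated portal impressions from the same device into one visitor per calendar day (a simplification of a rolling 24-hour window). The record layout is an assumed analytics export, not any specific platform schema:

```python
from datetime import datetime

# Assumed export format: (mac, iso_timestamp, variant, converted)
raw_sessions = [
    ("AA:BB:CC:DD:EE:01", "2024-05-04T10:02:00", "B", False),
    ("AA:BB:CC:DD:EE:01", "2024-05-04T10:41:00", "B", True),   # same device, same day
    ("AA:BB:CC:DD:EE:02", "2024-05-04T11:15:00", "A", True),
]

def deduplicate(sessions):
    """Keep one record per (MAC, calendar day); a device counts as
    converted if any of its impressions that day converted."""
    deduped = {}
    for mac, ts, variant, converted in sessions:
        key = (mac.lower(), datetime.fromisoformat(ts).date())
        prev = deduped.get(key)
        if prev is None:
            deduped[key] = (variant, converted)
        else:
            deduped[key] = (prev[0], prev[1] or converted)
    return deduped

print(len(deduplicate(raw_sessions)))  # 2 unique visitors, not 3 raw impressions
```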

ROI and Business Impact

The business case for A/B testing captive portals is straightforward and highly measurable. Consider a healthcare trust or a large retail property that registers 50,000 unique device connections per month.

If the baseline conversion rate is 20%, the venue captures 10,000 profiles per month. By implementing a testing programme that raises conversion to 35%, the venue captures 17,500 profiles, an additional 90,000 profiles per year without increasing footfall or marketing spend.
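The arithmetic behind those figures, written out explicitly with the numbers quoted above:

```python
connections_per_month = 50_000

baseline_profiles = int(connections_per_month * 0.20)   # 10,000 profiles per month
optimised_profiles = int(connections_per_month * 0.35)  # 17,500 profiles per month
extra_per_year = (optimised_profiles - baseline_profiles) * 12

print(extra_per_year)  # 7,500 extra profiles per month -> 90,000 per year
```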

These additional profiles feed directly into downstream systems. When correctly integrated, for example via Mailchimp Plus Purple: Automated Email Marketing from WiFi Registrations, this expanded audience translates directly into higher engagement rates, increased loyalty programme sign-ups, and a measurable uplift in revenue.

Key Terms and Definitions

Captive Portal

A web page that a user of a public access network is obliged to view and interact with before access is granted.

The primary ingestion point for guest data in enterprise WiFi deployments.

Minimum Detectable Effect (MDE)

The smallest improvement in conversion rate that you care to measure and that justifies the cost of implementing the change.

Used before a test begins to calculate the required sample size. Setting an MDE too low requires impractically large sample sizes.

Statistical Significance

The mathematical likelihood that the difference in conversion rates between Variant A and Variant B is not due to random chance.

IT teams use a 95% confidence level to ensure they don't deploy a 'winning' design that was actually just a statistical fluke.

Walled Garden

A restricted environment that controls the user's access to web content and services prior to full authentication.

Crucial when testing social logins; the OAuth domains (e.g., accounts.google.com) must be whitelisted in the walled garden.

Captive Network Assistant (CNA)

The pseudo-browser that operating systems (like iOS or Android) automatically open when they detect a captive portal.

CNAs have limited functionality (no tabs, limited cookie support, aggressive timeouts). Portal designs must be tested specifically within CNAs, not just standard desktop browsers.

Session Persistence

The mechanism by which a user is consistently served the same variant of a portal if they disconnect and reconnect during the test period.

Essential for data integrity. Usually achieved by hashing the device MAC address to assign the variant.

Novelty Effect

A temporary spike in user engagement caused simply by a design being new or different, rather than inherently better.

Mitigated by discarding the first few days of test data to allow returning users to normalise their behaviour.

A/B/n Testing

An experimental framework where more than two variants (A, B, C, etc.) are tested simultaneously against a control.

Requires significantly higher footfall/traffic than standard A/B testing to reach statistical significance in a reasonable timeframe.

Case Studies

A 400-room business hotel currently uses a captive portal requiring Name, Email, and Room Number, achieving a 22% conversion rate. The marketing director wants to increase this to 30% to grow their loyalty database. They propose testing a new variant that adds a 'Company Name' field but offers a free coffee voucher upon sign-up. How should the IT manager structure this test?

The IT manager should structure a 14-day A/B test. Variant A (Control) remains the 3-field form. Variant B (Challenger) becomes the 4-field form with the coffee voucher offer. To detect an 8 percentage point lift (from 22% to 30%) at 95% confidence, they need approximately 1,100 unique visitors per variant. Given the hotel's occupancy, this will take about 10 days, but the test must run for 14 days to capture two full business cycles (weekday corporate vs. weekend leisure).

Implementation notes: This scenario tests the balance between friction (adding a field) and incentive (the voucher). The IT manager correctly identifies the need for a full two-week cycle. Often, adding fields depresses conversion, but a strong enough incentive can overcome this friction. The test will definitively prove which force is stronger.
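As a sanity check, the sample-size sketch from earlier in this guide lands close to the 1,100 figure quoted here if you assume the same conservative power level (around 99%). This is an illustrative reading of how the number could be derived, not a statement of the exact method used:

```python
# Reusing sample_size_per_variant() from the earlier sketch:
# baseline 22%, absolute MDE of 8 points (22% -> 30%), alpha = 0.05
print(sample_size_per_variant(0.22, 0.08, power=0.99))  # ~1,096 per variant
```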

A large stadium with 60,000 capacity experiences severe network congestion during the 15-minute half-time interval. The current captive portal requires email verification via a magic link. Conversion is only 12%. The network architect wants to test a one-click 'Sign in with Apple/Google' variant. What are the specific technical constraints for this test?

The architect must configure the walled garden (pre-authentication whitelist) to allow traffic to Apple and Google's OAuth servers. Without this, the social login buttons will fail to load or authenticate. The test should be run across three consecutive match days to ensure sufficient sample size and to account for different fan demographics. The primary metric is not just conversion rate, but 'time-to-authenticate' to ensure the new method reduces DHCP lease holding times during the half-time rush.

Implementation notes: In high-density environments like stadiums, captive portal design is as much about network throughput as it is about marketing. The architect correctly identifies that social login requires specific walled garden configurations. Measuring time-to-authenticate is a critical secondary metric for venue operations.

Scenario Analysis

Q1. A retail chain runs a portal test for 5 days. Variant B shows a 45% conversion rate compared to Variant A's 30%. The marketing team wants to deploy Variant B immediately across all 50 stores. As the IT manager, what is your recommendation?

💡 Hint: Consider the 'Two-Cycle' rule and the concept of business cycles in retail.

Show recommended approach

Do not deploy yet. Five days is insufficient because it does not cover a full business cycle (a full week including both weekdays and weekends). Retail footfall demographics change significantly between Tuesday morning and Saturday afternoon. The test must run for at least 14 days to ensure Variant B performs consistently across all shopper profiles, even if statistical significance appears to have been reached early.

Q2. You are testing a new portal design that includes a large, high-resolution background video to showcase a new hotel property. During the test, Variant B (the video version) shows a significantly lower conversion rate than the plain text Control, but network logs show high drop-off before the page even fully renders. What is the likely technical issue?

💡 Hint: Consider the environment where captive portals load on mobile devices.

Show recommended approach

The high-resolution video is causing Captive Network Assistant (CNA) timeouts. CNAs on iOS and Android have aggressive timeout thresholds and limited resources. If the page weight is too heavy (e.g., a large video file) over the pre-authenticated walled garden connection, the OS will assume the network is broken and close the CNA window before the user can authenticate. The mitigation is to remove the video, keep page weight under 500KB, and re-test.

Q3. A venue wants to test changing the portal CTA from 'Sign In' to 'Join WiFi & Get Offers'. They also want to change the button colour from grey to Purple, and remove the 'Last Name' field. They propose launching this as Variant B. Why is this experimental design flawed?

💡 Hint: Review the 'Test One, Learn One' memory hook.

Show recommended approach

This design violates the principle of isolating variables. By changing the copy, the colour, and the form length simultaneously in a single variant, the team will not know which specific change caused the outcome. If conversion increases, was it the shorter form or the better copy? The test should be restructured to isolate one variable (e.g., test the copy change first), or structured as a multi-variate test (MVT) if traffic volumes permit.